CHAPTER I


INTRODUCTION 

Recent  advances  in  grid  generation,  solution  algorithms,  scientific  visualization,  and 
computer  architecture  have  made  it  possible  to  simulate  and  analyze  flow  fields  about 
increasingly  complex  flight  vehicles  using  Computational  Fluid  Dynamics  (CFD).  However, 
there  is  still  a  significant  difference  between  reality  and  the  overall  complexity  of  the  simulations. 
Computational  models  of  flight  vehicles  that  are  often  labeled  as  complex  or  complete  are 
usually  simplified  approximations  to  the  real  vehicle.  Simulation  of  unsteady,  viscous  flow  fields 
about  vehicles  with  real  operating  conditions,  such  as  maneuvering  vehicles,  varying  engine 
conditions, moving or separating components, and separating stores, is typically considered
too  demanding  for  current  technology.  In  addition,  tools  available  for  Scientific  Visualization 
analysis  of  the  data  generated  by  time-dependent  simulations  are  not  adequate.  The  underlying 
fluid  mechanics  of  complex  time-dependent  flow  fields  is  not  well  understood,  and  usable 
computational  simulation  and  analysis  tools  could  provide  significant  insight.  There  is  a  real 
need for a capability to simulate and analyze complex flow fields about flight vehicles with
realistic  geometry  and  operating  conditions.  This  research  addresses  this  need  for  the  case 
of  flight  vehicle  configurations  with  separating  stores,  moving  or  separating  components,  and 
components  during  maneuvers.  These  applications  are  of  importance  to  many  DoD  agencies 
and  the  aerospace  industry. 

The  primary  objective  of  this  research  project  was  to  produce  a  research  capability  to  perform 
detailed  CFD  simulations  and  Scientific  Visualization  analysis  of  unsteady,  three-dimensional, 
compressible,  viscous  flow  fields  about  flight  vehicle  configurations  of  interest  to  the  Department 
of  Defense  (DoD).  The  target  applications  include,  but  are  not  limited  to,  separation  of  single  or 
multiple stores from aircraft or missiles, launch vehicle or missile booster separation, separation
of crew escape modules or ejection seats, separation of fairings, etc. from missile systems, and
maneuvering aircraft or missiles. Target flow conditions include compressible flow regimes
with  vehicles  operating  at  Mach  numbers  from  high  sub-sonic  to  moderate  hyper-sonic,  laminar 
and  turbulent  viscous  flow,  and  flow  of  gases  in  chemical  and  thermal  equilibrium. 

At  Mississippi  State  University,  the  Computational  Fluid  Dynamics  Laboratory  (CFD  Lab) 
at  the  National  Science  Foundation  Engineering  Research  Center  for  Computational  Field 
Simulation  conducts  an  extensive  program  of  application-driven  basic  and  applied  research 
on  computer-based  simulation  and  design  methodology  that  encompasses  grid  generation,  flow 
solvers  and  visualization  using  both  structured  and  unstructured  grid  topologies,  scalable  parallel 
computing,  technology  demonstrations  for  leading-edge  problems,  and  integrated  simulation 
systems  for  design  environments.  The  CFD  Lab  has  brought  together  a  group  of  individuals 
from  a  variety  of  engineering  and  computational  disciplines  for  the  common  goal  of  creating 
a software environment that is capable of completing all tasks required for CFD analysis [1].
The  creation  of  this  multidisciplinary  environment  has  brought  about  the  development  of  a  rich 
suite  of  tools  that  take  the  problem  of  solving  complex  physics  on  complex  configurations  from 
geometry  to  grid  to  solution  through  to  visualization  [2],  [3],  [4].  This  blend  of  individuals  from 
a  variety  of  disciplines  is  a  natural  environment  for  conducting  research  on  the  complex  time- 
dependent  problems  for  this  grant.  The  capabilities  developed  under  this  grant  are  a  direct  result 
of  having  worked  as  a  cross-disciplinary  team  created  to  solve  and  analyze  numerical  simulations 
of  complex  flow  on  complex  geometries. 

As  a  capstone  example  of  the  capability  created  within  the  scope  of  this  grant,  a  simulation 
of  a  strap-on  booster  separating  from  a  Delta  II  launch  vehicle  was  performed.  In  the  overall 
simulation,  flow  about  the  complete  configuration  was  initially  modeled.  Next,  the  strap-on 
booster  by  itself  was  modeled  just  after  separation.  During  this  portion  of  the  time-accurate 
simulation,  the  strap-on  booster  tumbles.  An  integrated  six-degrees-of-freedom  (6DOF)  model 
determines  the  strap-on  booster  kinematics  based  on  the  aerodynamic  loadings.  This  simulation 
required significant advances in the areas of unstructured grid generation, unsteady solution
algorithms, and parallel implementations. Scientific visualization was then used to analyze the
flow  field  data  from  the  simulation.  The  visual  analysis  of  the  flow  field  was  done  using  a  suite  of 
scientific  visualization  tools  developed  in  part  under  this  grant.  Finally,  a  movie  was  generated 
which  depicts  the  flow  field  physics  and  strap-on  booster  kinematics.  The  movie  tracks  both  the 
motion  and  flow  field  variables  on  the  strap-on  booster  surface  and  in  the  surrounding  field. 

The  research  capabilities  produced  from  this  grant  have  provided  insight  into  the  fluid 
mechanics  of  complex  time-dependent  flow  fields  for  flight  vehicles  with  real  operating 
conditions and have guided, and will continue to guide, future research in this area. These tools significantly
advance  the  state-of-the-art  in  CFD,  Scientific  Visualization,  and  the  overall  simulation  and 
analysis  capability  available  to  DoD  agencies  and  the  aerospace  industry.  As  an  example  of 
the  capabilities  guiding  future  research,  technology  developed  in  part  under  this  grant  has  also 
been  applied  to  capstone  simulations  of  submarine  maneuvers  and  gust  response  of  a  tilt-rotor 
aircraft. 

The  following  chapters  discuss  the  various  fields  of  study  that  were  investigated  to  meet  the 
objectives  of  this  grant.  Chapter  2  provides  a  background  on  the  grid  generation  algorithms 
that  were  used  to  generate  the  static  portions  of  the  grids.  Chapter  3  provides  the  details  on 
the  relative  body  motion  calculations  and  is  a  companion  chapter  with  chapter  4,  discussing 
the 6DOF model. Chapter 5 provides the details for the computational methodology. Given
in  chapters  6  and  7  is  an  overview  of  the  visualization  tools  developed  to  animate  the  time- 
dependent  unsteady  simulation.  Finally,  chapter  8  discusses  the  movie  and  pertinent  results 
produced  from  this  research. 


CHAPTER II


GRID  GENERATION 

2.1  Introduction 

Unstructured  grid  technology  is  a  promising  approach  offering  geometric  flexibility  for 
handling  of  both  complex  geometry  and  physics.  As  such,  it  can  provide  a  powerful 
capability  for  accurately  and  efficiently  computing  complex  flow  fields  about  realistic  aerospace 
configurations.  Several  unstructured  grid  generation  and  flow  solver  procedures  have  been 
developed  and  successfully  demonstrated  for  inviscid  flow  about  complex  configurations.  For 
isotropic  elements,  existing  procedures  are  robust  and  capable  of  generating  high-quality  grids 
efficiently.  For  anisotropic  elements  in  viscous  flow  applications,  further  improvements  in 
efficiency,  robustness,  and  quality  of  unstructured  grid  generation  procedures  are  needed.  In  this 
chapter,  a  grid  generation  procedure  is  presented  which  offers  the  potential  for  overall  improved 
performance  and  quality. 

The  most  common  approach  used  to  generate  anisotropic  unstructured  grids  is  to  use  a 
layered approach and generate points along normals from solid boundaries. Unstructured
grid generation procedures for viscous applications have been developed using a modified advancing-front
method  by  Hassan,  et  al  [5],  a  semi-structured  approach  by  Lohner  [6],  advancing-normal  by 
Marcum  [7],  and  advancing-layers  by  Pirzadeh  [8].  Hybrid  methods  for  prismatic/tetrahedral 
grid generation have been developed by Kallinderis [9] and Sharov and Nakahashi [10]. While
all  of  these  methods  differ  in  how  points  and  elements  are  generated,  they  all  produce  very 
structured  and  aligned  elements  adjacent  to  solid  boundaries  and  use  isotropic  tetrahedral 
elements  outside  of  the  anisotropic  or  boundary-layer  region.  Use  of  prismatic  elements  within 
the  anisotropic  region  can  reduce  subsequent  memory  and  CPU  requirements  for  the  flow  solver 
without any loss of accuracy.

Figure 2.1: Trapped sliver element between prismatic groups of tetrahedral elements.

The goal of the present work is to modify the existing advancing-normal and advancing-front
local-reconnection method [7] for efficient generation of high-quality, mixed element type
unstructured  grids  for  viscous  flow  applications.  While  this  method  has  many  advantages,  the 
local-reconnection  process  is  unable  to  remove  trapped  sliver  elements.  Such  elements  can  be 
formed between prismatic groups of elements as shown in Figure 2.1. Local-reconnection or
connectivity optimization cannot remove these trapped slivers as they represent a local minimum
state  that  cannot  be  removed  without  reconnecting  a  potentially  very  large  number  of  elements. 
An  alternative  is  to  use  the  same  point  placement  strategy  and  discard  the  connectivity  in  favor 
of  a  hybrid  approach.  With  a  hybrid  approach  the  element  connectivity  is  directly  implied.  Also, 
the  elements  can  be  recovered  as  either  all  tetrahedra  or  a  mixture  of  five  and  six  node  pentahedra 
and tetrahedra. This combined approach retains most of the generality of the original and is very
efficient.

2.2 Unstructured Grid Generation Procedure

The  approach  used  in  the  present  work  uses  the  advancing-normal  point  placement 
algorithm [7] to generate points within the anisotropic region. The advancing-front/local-reconnection
(AFLR) procedure [11], [12] is used to generate tetrahedral elements in the isotropic
region. The basic steps in the overall procedure are listed below.

1. Generate a boundary surface grid.

2. Generate a volume triangulation of the boundary points. No boundary recovery is required.

3.  Create  new  points  using  advancing-normal  point  placement. 

4.  Attach  new  points  to  the  existing  triangulation  for  searching  and  checking. 

5.  Create  a  new  boundary  surface  grid  using  the  inflated  surface  from  advancing-normal  point 
placement. 

6.  Use  AFLR  to  generate  an  isotropic  tetrahedral  element  grid  for  the  remaining  regions. 

7.  Merge  anisotropic  and  isotropic  regions.  Element  connectivity  within  the  anisotropic 
region  is  directly  determined  from  the  point  ordering. 

2.2.1 Advancing-Normal Anisotropic Grid Generation

With  advancing-normal  type  point  placement  for  high-aspect-ratio  elements,  the  standard 
AFLR procedure [7] does produce sliver elements of the type shown in Figure 2.1. These
elements  are  generated  only  in  regions  of  high-aspect-ratio  elements  with  a  very  structured 
alignment.  Elimination  of  these  elements  with  local-reconnection  is  not  feasible.  There  may 
be  no  nearby  optimization  path  which  produces  a  better  connectivity.  The  problem  is  inherently 
due  to  the  very  structured  nature  of  the  grid  in  these  regions.  Only  a  limited  set  of  possible 
triangulations,  that  do  not  contain  sliver  elements,  exists  for  a  set  of  tetrahedra  aligned  in 
prismatic  groups.  A  modified  process  is  proposed  here  which  eliminates  the  sliver  problem  and 
retains the generality and efficiency of the original procedure. In the present approach, local-reconnection
is not used to determine the connectivity in these regions. Instead, the connectivity
is directly determined by the order in which points were generated. This produces a very
structured connectivity and allows the elements created to either be all tetrahedra or of mixed
type. 

The  basic  steps  in  the  modified  procedure  are  listed  below. 


1. Determine a normal vector at each active boundary-layer point. Initially the normals are
based  solely  on  the  original  boundary  surface  geometry.  As  the  generation  advances  the 
normals  are  generated  using  the  geometry  of  the  outer  layer  of  the  boundary-layer  grid. 

2.  Smooth  the  normal  vectors  with  a  weighting  dependent  upon  the  distance  from  the 
boundary. Initially the normals are unsmoothed. At the estimated end of the boundary-layer
region full smoothing is applied. The end of the boundary-layer region is based on an
estimate of where the element aspect ratio will be near isotropic.

3.  Generate  new  points  one  layer  at  a  time.  New  points  are  created  along  the  normal  vector 
with  the  normal  spacing  determined  using  geometric  growth  from  the  boundary  surface. 
Generation of new points along a normal is shown in Figure 2.2.

4.  Check  distance  between  new  points  and  surrounding  element  quality.  The  volume 
triangulation  is  used  to  efficiently  check  nearby  points.  As  boundary-layers  merge  new 
points  may  be  too  close  and  advancement  should  terminate  locally.  A  new  point  is  rejected 
if  the  distance  between  it  and  any  nearby  new  (or  existing)  point  is  less  than  a  preset 
fraction  of  the  local  element  length  scale.  Boundary-layer  advancement  is  terminated 
locally  if  a  new  point  is  rejected. 

5.  New  points  are  also  rejected  if  any  of  the  surrounding  elements  that  they  may  produce  fail 
a  quality  check  (maximum  angle  <  160  deg.).  Boundary-layer  advancement  is  terminated 
locally  if  a  new  point  is  rejected  for  quality. 

6.  Active  points  can  become  isolated  as  boundary-layer  advancement  is  locally  terminated. 
This  can  be  prevented  by  rejecting  a  new  point  and  terminating  local  advancement  if  more 
than  some  fraction  of  its  neighbors  have  been  terminated. 

7.  Check  element  aspect-ratio.  As  the  grid  advances  and  the  normal  spacing  increases 
the  element  aspect-ratio  will  eventually  be  isotropic.  Boundary-layer  advancement  is 
terminated  locally  when  the  aspect-ratio  on  the  next  layer  would  be  greater  than  unity. 

8.  Attach  accepted  new  points  to  the  volume  triangulation.  New  points  are  connected  and 
attached  to  the  existing  element  that  contains  them. 

9.  Generate  a  new  boundary  surface  grid  by  inflating  the  previous  surface  at  points  that  have 
continued  to  advance. 

10. Repeat steps 1 through 9 until no new points are accepted.

Figure 2.2: Advancing-normal point placement along normal line from boundary surface.
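
Steps 4 and 7 of the modified procedure amount to simple local tests applied to each candidate point. The following is a minimal sketch of those two tests, not the implementation used in the present work. The function names, the `min_fraction` default of 0.5, and the definition of aspect ratio as normal spacing divided by tangential edge length are assumptions made for illustration; the text only refers to "a preset fraction" and to the aspect ratio exceeding unity.

```python
import math

def too_close(candidate, nearby_points, local_length, min_fraction=0.5):
    """Step 4: True if the candidate crowds any nearby new or existing point.

    min_fraction is a hypothetical value for the preset fraction of the
    local element length scale.
    """
    limit = min_fraction * local_length
    return any(math.dist(candidate, p) < limit for p in nearby_points)

def past_isotropy(next_normal_spacing, tangential_length):
    """Step 7: True if the aspect ratio on the next layer would exceed unity.

    Aspect ratio is assumed here to be normal spacing over tangential length.
    """
    return next_normal_spacing / tangential_length > 1.0

def keep_advancing(candidate, nearby_points, local_length,
                   next_normal_spacing, tangential_length):
    """Advancement stops locally as soon as either test fails."""
    if too_close(candidate, nearby_points, local_length):
        return False
    return not past_isotropy(next_normal_spacing, tangential_length)
```

In a full generator the `nearby_points` list would come from a local search of the volume triangulation rather than from a global scan, which is the reason the triangulation is maintained during point placement.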

2.2.2  Normal  Spacing 

The  normal  spacing  is  determined  using  accelerated  geometric  growth.  The  initial  normal 
spacing can be specified globally or at each boundary point. Standard geometric growth is used
with an accelerated growth factor. The normal spacing is determined from

$$\Delta s_{n+1} = \alpha_n \, \Delta s_n \qquad (2.1)$$

$$\alpha_{n+1} = \min(\beta \, \alpha_n, \; \alpha_{\max}) \qquad (2.2)$$

where $\Delta s_n$ is the normal spacing for layer $n$, $\alpha_n$ is the growth factor for layer $n$, $\alpha_{\max}$ is the
maximum allowable growth factor, and $\beta$ is the growth acceleration factor.
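
As an illustration of accelerated geometric growth, the short sketch below generates the spacing for successive layers. It assumes the spacing recurrence is $\Delta s_{n+1} = \alpha_n \, \Delta s_n$ together with the growth-factor update of Eq. (2.2); the function name and the numerical values in the usage note are hypothetical and are not taken from the present work.

```python
def layer_spacings(ds0, alpha0, beta, alpha_max, n_layers):
    """Normal spacing for each of n_layers layers.

    ds0       initial normal spacing at the boundary
    alpha0    initial growth factor
    beta      growth acceleration factor
    alpha_max maximum allowable growth factor
    """
    spacings = [ds0]
    alpha = alpha0
    for _ in range(n_layers - 1):
        spacings.append(alpha * spacings[-1])   # assumed form of Eq. (2.1)
        alpha = min(beta * alpha, alpha_max)    # Eq. (2.2)
    return spacings
```

For example, `layer_spacings(1.0, 1.5, 2.0, 2.0, 4)` grows the first layer by a factor of 1.5 and every later layer by the capped factor of 2.0, since the accelerated factor would otherwise exceed the maximum.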
2.2.3 Boundary Normal Vectors

A  boundary  normal  vector  is  required  for  the  advancing-normal  procedure.  This  normal  is 
determined  on  the  inflated  surface  during  each  pass.  On  the  first  pass  the  surface  is  the  same  as 
the  original  boundary  surface.  As  only  the  boundary  face  normals  are  unique  (since  planar  faces 
are used), some form of averaging or optimization procedure must be used to obtain the normal
vector  at  nodes.  Weighted  averages  can  produce  variations  in  normals  due  purely  to  topology  or 
local  face  area  differences.  In  the  present  work,  a  least-squares  optimization  procedure  is  used 
to  eliminate  those  variations.  An  error  function  is  defined  as 

$$e_j = 1 - \mathbf{b}_i \cdot \mathbf{n}_j \qquad (2.3)$$

where $e_j$ is the error function for face $j$, $\mathbf{n}_j$ is the face unit normal vector for face $j$, and $\mathbf{b}_i$ is
the node unit normal vector for node $i$. Node and face normals are shown in Figure 2.3. The
error function is also directly related to the volume of the element that will be produced from
a given face. Minimizing the error function also maximizes the element volume. Least-squares
optimization can be used to find $\mathbf{b}_i$ such that $\sum_j e_j^2$ is minimized. The resulting equations are

$$\sum_j \left[ (\mathbf{b}_i \cdot \mathbf{n}_j) \, n_j^x \right] = \sum_j n_j^x \qquad (2.4)$$

$$\sum_j \left[ (\mathbf{b}_i \cdot \mathbf{n}_j) \, n_j^y \right] = \sum_j n_j^y \qquad (2.5)$$

$$\sum_j \left[ (\mathbf{b}_i \cdot \mathbf{n}_j) \, n_j^z \right] = \sum_j n_j^z \qquad (2.6)$$

where $\sum_j$ denotes the sum over all faces $j$ surrounding node $i$ and $n_j^x$, $n_j^y$, and $n_j^z$ are the $x$, $y$, and
$z$ components of the unit normal vector for face $j$.
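
The least-squares conditions form a 3 by 3 linear system for the node normal: the matrix is the sum of the outer products of the face normals and the right-hand side is the sum of the face normals. The sketch below is one way to solve it, assuming the minimized quantity is the sum of squared errors $1 - \mathbf{b}_i \cdot \mathbf{n}_j$ over the surrounding faces. The function names, the use of Cramer's rule, and the fallback to the plain average direction when the system is singular (for example on a flat surface, where all face normals coincide) are choices made for this illustration and are not described in the source.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def node_normal(face_normals, eps=1e-12):
    """Unit node normal from the unit normals of the surrounding faces."""
    a = [[0.0] * 3 for _ in range(3)]   # sum of n n^T
    r = [0.0] * 3                       # sum of n
    for n in face_normals:
        for p in range(3):
            r[p] += n[p]
            for q in range(3):
                a[p][q] += n[p] * n[q]
    d = det3(a)
    if abs(d) < eps:
        b = r[:]                        # assumed fallback for a singular system
    else:
        b = []
        for col in range(3):            # Cramer's rule, one column at a time
            m = [row[:] for row in a]
            for row in range(3):
                m[row][col] = r[row]
            b.append(det3(m) / d)
    length = math.sqrt(sum(c * c for c in b))
    return [c / length for c in b]
```

At a cube corner where one of the three planar sides is split into two triangles, the face normals are (1,0,0), (1,0,0), (0,1,0), and (0,0,1). A plain average is biased toward the doubled normal, whereas this solve returns the symmetric direction (1,1,1)/√3, which is the kind of topology-induced variation the least-squares approach is meant to remove.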

Figure 2.3: Node normals determined from surrounding face normals.

2.2.4 Element Connectivity

The element connectivity for the points created using advancing-normal point placement
is determined directly from the order in which they were created. The initial point ordering
is re-ordered so that optimal element quality will be produced (concave first and convex last).
Tetrahedral  elements  are  created  (only  temporarily  if  mixed  elements  are  desired)  for  a  given  new 
point by inflating all surrounding boundary faces as shown in Figure 2.4. Pentahedral elements
can  also  be  created  using  points  generated  on  subsequent  layers.  Five  and  six  node  pentahedra 
are  formed  by  combining  tetrahedra  as  shown  in  Figure  2.5.  With  mixed  element  types,  five- 
node  pentahedra  and  tetrahedra  are  created  only  on  the  outer  layer  of  the  anisotropic  region  as 
shown  in  Figure  2.6.  These  elements  are  required  so  that  the  anisotropic  region  is  bounded  only 
by  triangular  faces.  All  of  the  elements  in  the  combined  grid  have  strict  node,  edge,  and  face 
matching  to  each  other  and  to  neighboring  tetrahedral  elements. 
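
The idea that connectivity is implied by point ordering can be illustrated with a small sketch. The present work forms pentahedra by combining temporary tetrahedra; the sketch below instead enumerates the element type directly from how many nodes of a boundary triangle advance to the next layer, which is a simplification for illustration only. The layer-major numbering in `node_id`, the `n_layers` array, and the function names are all hypothetical, and element orientation is not checked.

```python
def node_id(surface_node, layer, n_surface):
    """Hypothetical numbering: layer 0 is the boundary surface itself."""
    return layer * n_surface + surface_node

def element_on_triangle(tri, n_layers, m, n_surface):
    """Element built on boundary triangle `tri` between layers m and m+1.

    n_layers[i] is the number of layers generated above surface node i.
    Returns (element type, node list) or None if no node advances.
    """
    if all(n_layers[i] <= m for i in tri):
        return None

    def top_point(i):
        # A node that stopped advancing contributes its outermost point.
        return node_id(i, min(m, n_layers[i]), n_surface)

    advancing = [i for i in tri if n_layers[i] > m]
    stalled = [i for i in tri if n_layers[i] <= m]
    base = [top_point(i) for i in tri]
    if len(advancing) == 3:
        top = [node_id(i, m + 1, n_surface) for i in tri]
        return ("six-node pentahedron", base + top)
    if len(advancing) == 2:
        a, b = advancing
        quad = [node_id(a, m, n_surface), node_id(b, m, n_surface),
                node_id(b, m + 1, n_surface), node_id(a, m + 1, n_surface)]
        return ("five-node pentahedron", quad + [top_point(stalled[0])])
    return ("tetrahedron", base + [node_id(advancing[0], m + 1, n_surface)])
```

With three surface nodes and `n_layers = [2, 2, 1]`, the first layer yields a six-node pentahedron and the second yields a five-node pentahedron, which mirrors the statement that the transition elements appear where the anisotropic region ends.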

2.2.5  Isotropic  Grid  Generation 

Figure 2.4: Creation of tetrahedral elements using advancing-normal point placement.

The AFLR grid generation procedure used in the present work is a combination of automatic
point creation, advancing type ideal point placement, and connectivity optimization schemes. A
valid grid is maintained throughout the grid generation process. This provides a framework for
implementing  efficient  local  search  operations  using  a  simple  data  structure.  It  also  provides  a 
means  for  smoothly  distributing  the  desired  point  spacing  in  the  field  using  a  point  distribution 
function.  This  function  is  propagated  through  the  field  by  interpolation  from  the  boundary 
point  spacing  or  by  specified  growth  normal  to  the  boundaries.  Points  are  generated  using 
advancing-front  type  point  placement.  The  connectivity  for  new  points  is  initially  obtained  by 
direct  subdivision  of  the  elements  that  contain  them.  Connectivity  is  then  optimized  by  local- 
reconnection  with  a  combined  Delaunay  and  min-max  (minimize  the  maximum  angle)  type 
criterion.  The  overall  procedure  is  applied  repetitively  until  a  complete  field  grid  is  obtained. 
Complete details are presented in [11], [12].
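
To make the min-max criterion concrete, the following sketch applies it to the simplest case: a single edge swap between two triangles in two dimensions. The AFLR procedure works in three dimensions and combines this with a Delaunay criterion, so this is only a reduced illustration under stated assumptions. The function names are hypothetical, and the quadrilateral formed by the two triangles is assumed to be convex.

```python
import math

def max_angle(p, q, r):
    """Largest interior angle, in degrees, of the 2D triangle p, q, r."""
    def angle_at(a, b, c):
        u = (b[0] - a[0], b[1] - a[1])
        v = (c[0] - a[0], c[1] - a[1])
        cos_t = (u[0] * v[0] + u[1] * v[1]) / (math.hypot(*u) * math.hypot(*v))
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))
    return max(angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q))

def should_swap(a, b, c, d):
    """Quadrilateral a-b-c-d (vertices in order) is split by diagonal a-c.

    Returns True if switching to diagonal b-d lowers the maximum angle.
    """
    current = max(max_angle(a, b, c), max_angle(a, c, d))
    swapped = max(max_angle(a, b, d), max_angle(b, c, d))
    return swapped < current
```

For a thin diamond with vertices (0,0), (1,-0.1), (2,0), (1,0.1), the long diagonal produces two triangles with an angle near 169 deg., while the short diagonal keeps all angles below 85 deg., so the swap is accepted.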


Figure  2.5:  Pentahedral  elements  formed  by  combining  tetrahedral  elements. 

2.3  Application  Examples 

Selected  application  examples  are  presented  here  to  demonstrate  the  capabilities  of  the 
present procedure for generation of three-dimensional unstructured grids of mixed element types
that  are  suitable  for  Reynolds-Averaged  Navier-Stokes  simulations.  All  geometry  preparation 
and  surface  grid  generation  work  was  done  using  SolidMesh  [13]  with  AFLR  surface  grid 
generation  [12]. 

Figure 2.6: Use of pyramid and tetrahedral elements to transition between prismatic and
tetrahedral regions.

Grid quality distributions and statistics are presented for all examples in Figure 2.7. Element
angle is used as the grid quality measure. The complete set of grid quality data consists of
the six, eight, and nine dihedral angles for all tetrahedra, five-node pentahedra, and six-node
pentahedra respectively. In Figure 2.7, the distribution plot is in 5 deg. increments. As
shown, the distribution has peaks at 60 and 90 deg. from the six-node pentahedra and a peak
near 70 deg. from the tetrahedra. The results for the examples presented are representative of
those obtained for a variety of configurations. Typically, for isotropic elements in the grid, the
maximum element angle is 160 deg. or less, the standard deviation is 17 deg. or less, and 99.5%
or more of the element angles are between 30 and 120 deg. For anisotropic elements in
the grid, the maximum element angle is 170 deg. or less and 99.5% or more of the element
angles are between 30 and 135 deg. The minimum angle is usually dictated by the geometry.
Convex or concave edges with an included angle less than 20 deg., such as a sharp trailing edge
or the interior of a wedge, can produce larger maximum angles in the anisotropic region. The
maximum anisotropic element angle can be controlled by specifying the maximum allowable
angle (which eliminates generation of such elements) or by use of multiple normals at convex
edges.
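
The element-angle measure can be reproduced for a single tetrahedron with a few lines of code. The sketch below computes the six dihedral angles and bins them in 5 deg. increments, as in the distribution plot. It is an illustration only: the function names are hypothetical and the binning convention (lower edge of each bin) is an assumption.

```python
import math
from itertools import combinations

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def tet_dihedral_angles(vertices):
    """Six interior dihedral angles (degrees), one per edge of a tetrahedron."""
    angles = []
    for i, j in combinations(range(4), 2):
        k, l = [m for m in range(4) if m not in (i, j)]
        edge = _sub(vertices[j], vertices[i])
        # Normals of the two faces that share this edge.
        n1 = _cross(edge, _sub(vertices[k], vertices[i]))
        n2 = _cross(edge, _sub(vertices[l], vertices[i]))
        cos_t = _dot(n1, n2) / math.sqrt(_dot(n1, n1) * _dot(n2, n2))
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_t)))))
    return angles

def angle_histogram(angles, width=5):
    """Count angles per bin, keyed by the lower edge of each bin."""
    counts = {}
    for a in angles:
        lower = int(a // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return counts
```

A regular tetrahedron has all six dihedral angles equal to arccos(1/3), about 70.5 deg., so every angle falls in the 70 deg. bin. This is consistent with the peak near 70 deg. attributed to the tetrahedra.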

CPU time required on a SUN Ultra 60 workstation is presented in Table 2.1 for each example.
Computer routines for the three-dimensional grid generator are written in C with dynamic
memory that is automatically reallocated based upon actual requirements. All floating-point
calculations  are  performed  using  64  bit  precision  with  8  byte  data.  The  CPU  times  reported  are 
for  one  processor  and  include  all  I/O  and  generation  of  grid  quality  data.  A  boundary  surface  grid 
file  is  the  input.  The  output  includes  a  grid  coordinate  and  connectivity  file  and  a  quality  data 
file.  Memory  required  is  about  300  bytes  per  node  generated.  Requirements  for  memory  and 
CPU  time  vary  with  the  percentage  of  anisotropic  elements,  as  those  requirements  for  anisotropic 
generation  are  considerably  less  than  those  for  isotropic  generation. 

User  input  required  to  generate  a  complete  grid  is  minimal  and  includes  specifying  the  point 
spacing  at  selected  control  points  on  the  boundary  curves  for  surface  grid  generation.  Selection 
of  options  such  as  which  boundaries  to  generate  anisotropic  elements  from  and  initial  normal 
spacing  are  the  only  required  user  input  for  volume  grid  generation.  There  are  no  user  adjustable 
parameters  that  need  to  be  changed  from  case  to  case.  In  all  cases  presented  here,  the  initial 
normal  spacing  was  set  suitable  for  high  Reynolds  number  viscous  CFD  analysis.  Initial  normal 
spacing  was  determined  such  that  the  first  node  adjacent  to  a  viscous  surface  would  have  a  y+ 
value near 1.

Table 2.1: Number of nodes and elements generated and CPU time required for example cases.

Case                    Boundary   Nodes       Five-node    Six-node     Tetrahedra    CPU Time
                        Faces                  Pentahedra   Pentahedra                 (min)
EET Wing Body           272,920    2,290,661   33,001       3,744,105    2,004,663     41.9
Space Shuttle Orbiter   152,810    1,102,869   10,269       1,709,948    1,181,232     16.6
Titan IV-A Exterior     337,596    2,980,493   25,370       5,113,326    [illegible]   59.4
Titan IV-A Interior     88,502     571,399     31,548       861,492      642,859       5.9

2.3.1 Energy Efficient Transport (EET)

A  grid  suitable  for  high  Reynolds  number  viscous  CFD  analysis  was  generated  for  a  high- 
lift  wing/body  configuration  of  the  Energy  Efficient  Transport  (EET).  The  surface  grid  on  the 
underside  of  the  wing  is  shown  in  Figure  2.8.  Field  cuts  near  the  wing/body,  wing/slat,  and 
wing/flap regions are shown in Figure 2.9. Element size varies smoothly in the field and there
is  a  smooth  transition  between  the  anisotropic  and  isotropic  regions.  In  narrow  regions  between 
components,  the  number  of  anisotropic  layers  is  reduced  to  produce  high-quality  elements 
between  them.  Also,  the  symmetry  plane  grid  has  been  re-generated  to  match  the  interior 
anisotropic  elements  exactly.  Grid  quality  distributions  are  shown  in  Figure  2.7.  Number  of 
boundary  faces,  nodes,  elements  and  CPU  time  are  presented  in  Table  2.1.  An  all  tetrahedral 
element  version  of  this  grid  was  used  by  Sheng,  et  al  [14]  for  incompressible  flow  simulation 
with  an  implicit  multi-block  flow  solver. 

2.3.2  NASA  Space  Shuttle  Orbiter 

A  grid  suitable  for  high  Reynolds  number  viscous  CFD  analysis  was  generated  for  the  NASA 
Space  Shuttle  Orbiter.  The  surface  grid  is  shown  in  Figure  2.10.  Field  cuts  near  the  rocket 
motor and inboard flap regions are shown in Figure 2.11. Element size varies smoothly in the
field and there is a smooth transition between the anisotropic and isotropic regions. In narrow
regions  between  components,  the  number  of  anisotropic  layers  is  reduced  to  produce  high- 
quality  elements  between  them.  Also,  the  symmetry  plane  grid  has  been  re-generated  to  match 
the  interior  anisotropic  elements  exactly.  Grid  quality  distributions  are  shown  in  Figure  2.7. 
Number  of  boundary  faces,  nodes,  elements  and  CPU  time  are  presented  in  Table  2.1. 

2.3.3  Titan  IV-A  Launch  Vehicle 

A  grid  suitable  for  high  Reynolds  number  viscous  CFD  analysis  was  generated  for  a  Titan 
IV-A  launch  vehicle  wind  tunnel  test  model  configuration.  This  configuration  includes  two 
strap-on  solid  rocket  motors  (SRM),  thrust  vector  control  (TVC),  stage  separation  motor  (SSM), 
interstage cavity, and wind tunnel test model sting. The overall configuration and surface grid
near the SRM, TVC and SSM regions are shown in Figure 2.12. Field cuts near the SRM, TVC,
and  SSM  regions  are  shown  in  Figure  2.13.  A  field  cut  within  the  interstage  cavity  is  shown  in 
Figure  2.14.  Element  size  varies  smoothly  in  the  field  and  there  is  a  smooth  transition  between 
the  anisotropic  and  isotropic  regions.  In  narrow  regions  between  components,  the  number  of 
anisotropic  layers  is  reduced  to  produce  high-quality  elements  between  them.  Grid  quality 
distributions  are  shown  in  Figure  2.7.  Number  of  boundary  faces,  nodes,  elements  and  CPU  time 
are  presented  in  Table  2.1.  This  configuration,  and  the  others  presented  here,  are  representative  of 
the  high  level  of  geometric  complexity  that  can  be  handled  routinely  using  the  present  approach. 

2.4  Summary 

A  procedure  has  been  presented  for  efficient  generation  of  high-quality  unstructured  grids  of 
mixed  element  types  suitable  for  CFD  simulation  of  high  Reynolds  number  viscous  flow  fields. 
Layers  of  anisotropic  elements  are  generated  by  advancing  along  prescribed  normals  from  solid 
boundaries.  The  points  are  generated  such  that  either  pentahedral  or  tetrahedral  elements  with  an 
implied connectivity can be directly recovered. As points are generated they are temporarily
attached  to  a  volume  triangulation  of  the  boundary  points.  This  triangulation  allows  efficient 
local  search  algorithms  to  be  used  when  checking  merging  layers.  The  existing  AFLR  procedure 
is  used  to  generate  isotropic  elements  outside  of  the  anisotropic  region.  Results  were  presented 
for  a  variety  of  applications.  The  results  demonstrate  that  high-quality  anisotropic  unstructured 
grids  can  be  efficiently  and  consistently  generated  for  complex  configurations. 


Figure 2.9: Field cuts for EET grid, panels (a) through (c).

Figure 2.10: NASA space shuttle orbiter surface grid.

Figure 2.11: Field cuts for NASA space shuttle orbiter grid: (a) rocket motor region, (b) inboard flap region.

Figure 2.13: Field cuts for Titan IV-A launch vehicle external grid; panel (c) shows the SSM region.


CHAPTER III


RELATIVE  BODY  MOTION 

3.1  Introduction 

Unsteady  simulations  for  moving  geometries  typically  require  a  body  conforming  grid  at  all 
times.  If  the  bodies  in  the  flow  field  undergo  arbitrary  movement,  a  fixed  grid  will  lead  to 
badly  distorted  elements  which  will  result  in  convergence  difficulties  and  poor  quality  results. 
Remeshing  must  be  carried  out  in  order  to  have  a  body  conforming  grid.  One  option  is  to 
do  a  global  regeneration  which  is  an  expensive  process  and  can  degrade  the  accuracy  due  to 
accumulation  of  interpolation  errors. 

Unsteady  flows  with  moving  bodies  are  often  handled  using  adaptive  remeshing  [15,  16]  of 
regions  undergoing  rapid  changes.  It  has  been  observed  that  frequent  adaptations  may  lead  to 
poor numerical results and also can result in loss of essential physical features of the flow [17].
The  poor  performance  is  due  to  the  interpolation  of  data  from  one  grid  to  another.  In  order 
to  overcome  this  loss  of  information  and  to  increase  the  efficiency,  local  remeshing  [17]  can 
be  carried  out  in  the  vicinity  of  highly  distorted  elements.  A  typical  simulation  may  require 
several  regenerations.  In  order  to  carry  out  realistic  computations,  a  fast  remeshing  capability  is 
required. 

Identification of the region of grid deformation is an important task in the dynamic grid
generation procedure. One approach is the adaptive window procedure presented by Singh
et al. [18]. Windows are created by specifying a normal distance from the body of interest.
The entire domain is then searched to locate the points that fall within the window, which are
flagged as window points; this search is quite expensive. The window points are treated as a
spring network and are allowed to adapt to the body movement. The tension spring analogy is a
popular technique that has been used to solve moving body problems [18, 19] and has proven
successful for problems with small-scale deformation [19]. For large deformation, the spring
stiffness is critical and crossing of grid lines may be encountered, since the connectivity has to
be maintained at all times. Finite element methods have also been used to solve moving
body problems [17, 20]. An arbitrary Lagrangian-Eulerian (ALE) formulation, in which the
coordinates can move in an arbitrary way, has been used for solving transient problems with
large-scale deformations [20].

Most of the previously described methods were tested on inviscid grids. Many practical
situations involve the solution of viscous flows, which requires a high aspect ratio grid close to
the body. In this case, the problem becomes more severe, since even slight movement of the
bodies deforms the high aspect ratio elements and results in a poor-quality grid. Hence, a
general method is needed that can handle both viscous and inviscid grids with arbitrary motions.
Another real challenge for a dynamic mesh algorithm is the ability to handle the relative
motion of bodies in close proximity. In this case, badly distorted elements are encountered
with minimal deformation, and frequent remeshing is required to maintain grid quality.

In the present study a general dynamic unstructured grid algorithm is presented, which can
handle arbitrary motion. An efficient procedure for the identification of the window region is
developed. The problem of handling the relative motion of bodies in close proximity has been
addressed by developing a local marching procedure. Results are presented for mixed element
type grids to demonstrate the efficiency of the present algorithm. Topological changes are not
considered in the present research.


3.2  Grid  Generation 

Grid generation for a given geometry is carried out using the Advancing-Front/Local-Reconnection
(AFLR) [21, 12, 22] method. The AFLR procedure uses a combination of
automatic point creation and advancing type ideal point placement. A point distribution function
is assigned to each boundary point based on the local spacing. The growth rate normal
to the boundary is also specified for each boundary point. The point distribution function is
propagated through the field by interpolation. Points are generated using advancing-front type
point placement for isotropic elements [21], and advancing-normal type point placement for
high-aspect ratio elements [22]. Initial connectivity is obtained using direct subdivision. A
valid grid is maintained throughout the grid generation process. The grid connectivity is then
optimized using an iterative local-reconnection process subject to a quality criterion. A combined
Delaunay/min-max type criterion is used. The overall procedure is repeated until the entire field
grid is generated with the desired point spacings.

For  complex  geometries  a  mixed  element  type  grid  is  generated  with  pseudo-structured  elements 
in  the  regions  where  the  geometry  is  smooth.  To  improve  the  efficiency  of  the  flow  solver 
unstructured  tetrahedral  elements  are  used  elsewhere.  The  advancing  normal  type  point 
placement  used  for  the  generation  of  the  high-aspect  ratio  elements  inside  the  boundary  layer 
leads  to  structured  type  elements.  The  elements  inside  the  boundary  layer  are  combined  to  form 
mixed (pentahedral and tetrahedral) elements [22].

The  dynamic  grid  generation  process  can  involve  a  number  of  regenerations.  Therefore,  an 
efficient  grid  generation  procedure  is  extremely  important  for  a  moving  body  simulation.  The 
AFLR  procedure  has  a  demonstrated  ability  to  generate  high  quality  grids  about  geometrically 
complex  configurations  for  a  variety  of  applications  [12,  22].  Also  this  procedure  is  highly 
efficient  and  robust,  thus  providing  a  direct  contribution  to  the  efficiency  of  the  overall  dynamic 
grid  generation  algorithm. 


3.3 Window (Deforming Region) Identification

Identification of windows corresponding to the moving bodies plays a significant role in the
dynamic grid generation process. The ability of the procedure to identify this region has an
immediate consequence for the overall efficiency, since the windows have to be recreated
many times during the simulation. The first step in the present dynamic grid generation algorithm
is to form the protected layers corresponding to all the stationary surfaces in a given grid. This
is done in order to avoid the intersection of body surfaces and to preserve the boundary layer
of the stationary bodies. Rigid layers are formed for each moving body and are
allowed to move with the body. If a moving body has a viscous boundary condition, then the
entire boundary layer is allowed to move with the body in a rigid fashion. This prevents the
distortion of the high aspect ratio boundary layer elements and preserves a high-quality grid.
Windows are identified outside the rigid layers, and the nodes that fall inside this window region
are allowed to adapt to the body movement.

3.3.1  Marching  Procedure 

The  protected  layers,  rigid  layers  and  windows  are  formed  by  using  a  marching  procedure.  The 
list  of  elements  surrounding  a  node  is  built  using  the  grid  connectivity  information.  The  nodes 
that  appear  on  the  moving  body  surface  are  tagged  as  surface  nodes.  Marching  is  carried  out 
from  the  surface  using  the  list  of  elements  surrounding  a  node.  The  nodes  of  elements  that  do 
not  belong  to  the  current  layer  are  tagged  as  nodes  corresponding  to  the  next  layer.  Marching 
is  continued  until  a  specified  number  of  layers  are  identified.  This  procedure  is  highly  efficient, 
since  marching  is  carried  out  locally  from  only  those  nodes  that  are  tagged  as  belonging  to  the 
surface  of  interest.  In  the  case  of  mixed  element  type  grids  the  window  region  contains  only  the 
tetrahedral  elements  as  all  the  mixed  elements  occur  inside  the  boundary  layer.  The  end  of  the 
rigid  layer  and  that  of  the  window  region  define  the  boundaries  of  the  deforming  region.  These 
boundaries  form  a  valid  surface  grid  used  when  the  deforming  region  undergoes  regeneration 
based on the quality criterion. The outer boundary of the window forms the interface between
the  deforming  and  non-deforming  regions. 
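
The marching procedure described above can be sketched as a layer-by-layer sweep over the node-to-element map. This is a minimal illustration under stated assumptions, not the report's implementation: the function names, the dictionary-based storage and the small triangle strip used as input are all hypothetical.

```python
from collections import defaultdict


def build_node_to_elements(elements):
    """Map each node id to the indices of the elements that contain it."""
    node_to_elems = defaultdict(list)
    for e, nodes in enumerate(elements):
        for n in nodes:
            node_to_elems[n].append(e)
    return node_to_elems


def march_layers(elements, surface_nodes, n_layers):
    """Return {node: layer}; layer 0 holds the tagged surface nodes."""
    node_to_elems = build_node_to_elements(elements)
    layer_of = {n: 0 for n in surface_nodes}
    front = list(surface_nodes)
    for layer in range(1, n_layers + 1):
        next_front = []
        for n in front:
            # Visit only the elements surrounding the current front node.
            for e in node_to_elems[n]:
                for m in elements[e]:
                    if m not in layer_of:  # untagged node -> next layer
                        layer_of[m] = layer
                        next_front.append(m)
        if not next_front:
            break
        front = next_front
    return layer_of


# Hypothetical input: a strip of four triangles, marching from node 0.
strip = [(0, 1, 2), (1, 2, 3), (2, 3, 4), (3, 4, 5)]
layers = march_layers(strip, surface_nodes=[0], n_layers=2)
print(layers)
```

Because only nodes on the current front are visited, the work per layer scales with the size of the front rather than with the size of the whole grid, which is the property the text attributes to the procedure.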

3.3.2  Local  Marching 

Modeling of practical geometries often includes bodies in close proximity with narrow gaps,
e.g., launch vehicle geometries with strap-on boosters. The number of layers generated between
bodies that are very close is restricted by the size of the gap. Uniform marching in such
gaps will result in intersection with either the boundary layer or the surface of the neighboring
body. Therefore a local marching procedure has been developed. Marching is stopped locally
in any region where a node comes in contact with nodes that are already marked. This allows the
layers to grow only in those regions where elements are available for marching. Intersection of
the window regions is another problem that is encountered when moving bodies are close to each
other. This problem is treated by identifying a common window for all bodies whose windows
intersect. The common window is used until the bodies move sufficiently far apart to establish
their independent windows.
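
A minimal sketch of these two ideas follows, assuming a simple reading of the stopping rule: a front node that touches an already-marked node does not advance. The names `march_local` and `merge_intersecting`, the `marked` set and the chain-shaped test grid are hypothetical and are not taken from the report.

```python
from collections import defaultdict


def node_neighbors(elements):
    """Map each node to the set of nodes that share an element with it."""
    nbrs = defaultdict(set)
    for nodes in elements:
        for n in nodes:
            nbrs[n].update(m for m in nodes if m != n)
    return nbrs


def march_local(elements, surface_nodes, n_layers, marked):
    """March from the surface, stopping locally next to marked nodes."""
    nbrs = node_neighbors(elements)
    layer_of = {n: 0 for n in surface_nodes}
    front = list(surface_nodes)
    for layer in range(1, n_layers + 1):
        next_front = []
        for n in front:
            if any(m in marked for m in nbrs[n]):
                continue  # contact with marked nodes: stop here only
            for m in nbrs[n]:
                if m not in layer_of:
                    layer_of[m] = layer
                    next_front.append(m)
        front = next_front
    return layer_of


def merge_intersecting(windows):
    """Group bodies whose window node sets intersect into common windows."""
    groups = []  # list of (set of bodies, set of nodes), pairwise disjoint
    for body, nodes in windows.items():
        bodies, merged = {body}, set(nodes)
        remaining = []
        for group_bodies, group_nodes in groups:
            if group_nodes & merged:
                bodies |= group_bodies
                merged |= group_nodes
            else:
                remaining.append((group_bodies, group_nodes))
        remaining.append((bodies, merged))
        groups = remaining
    return groups


# Hypothetical narrow gap: a chain of nodes 0..7 with nodes 5..7 already
# marked as belonging to the neighboring body.
chain = [(i, i + 1) for i in range(7)]
print(march_local(chain, [0], n_layers=10, marked={5, 6, 7}))
print(merge_intersecting({"A": {1, 2, 3}, "B": {3, 4}, "C": {10}}))
```

In the chain example ten layers are requested, but marching halts once the front reaches the node adjacent to the marked set, so the layer count adapts to the gap width.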


3.4 Grid Motion and Remeshing

Grid motion has to be carried out in order to follow the moving boundary. The motion is
implemented in two steps. In the first step, the rigid layers attached to each body move
with the body with which they are associated. This keeps the grid undistorted close to
the surface. In the second step, the grid region inside the window is deformed in such a way that
element distortion is minimized. This is achieved by calculating weights associated with the
nodes. All nodes on moving surfaces and in the rigid layers have a weight of one,
while nodes in the non-deforming region and on all other stationary
surfaces have a weight of zero. The weights of the nodes inside the window region vary
smoothly from one to zero, as shown in Figure 3.1.
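
The weighting can be illustrated with a short sketch. The report states only that the weights vary smoothly from one to zero; the linear variation with window layer index used here, the layer numbering convention and the function names are assumptions made for the example.

```python
def window_weight(layer, n_window_layers):
    """Weight of a node given its window layer index.

    Assumed convention: layer 0 = moving surface and rigid layers (weight 1),
    layers 1..n = window region, layer >= n = non-deforming region (weight 0).
    """
    k = min(max(layer, 0), n_window_layers)
    return 1.0 - k / n_window_layers


def deform(coords, layers, n_window_layers, displacement):
    """Move each node by its weight times the rigid-body displacement."""
    moved = []
    for xyz, layer in zip(coords, layers):
        w = window_weight(layer, n_window_layers)
        moved.append(tuple(c + w * d for c, d in zip(xyz, displacement)))
    return moved


# Hypothetical line of nodes leaving the body along x, five window layers,
# and one translation step of 0.025 in x.
coords = [(float(i), 0.0, 0.0) for i in range(7)]
layers = [0, 0, 1, 2, 3, 4, 5]
new_coords = deform(coords, layers, 5, (0.025, 0.0, 0.0))
for old, new in zip(coords, new_coords):
    print(old[0], "->", round(new[0], 6))
```

Nodes in the rigid layers receive the full displacement, the outermost window layer stays fixed, and the nodes in between absorb the motion gradually.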

After each deformation the grid quality of the window region is checked by calculating the
element dihedral angles. A low-quality element will have a dihedral angle close to 180 degrees.
Elements with angles up to 170 degrees are typically acceptable. In cases where the bodies are
in close proximity, the maximum angle of the undeformed window region is used for the quality
criterion. In addition to the angle criterion, the element volume is checked. If the volume of a
given element differs from its original volume by more than a factor of three, the quality
criterion is not satisfied; a volume ratio of up to three is considered acceptable during the
deformations.
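
The two checks can be sketched for a single tetrahedron as follows. This is an illustrative sketch, not the report's code: the helper names are hypothetical, the elements are assumed to be positively oriented, and the volume test is assumed to apply in both directions (growth or shrinkage by more than a factor of three).

```python
import math
from itertools import combinations


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tet_volume(p):
    """Signed volume of the tetrahedron with vertices p[0]..p[3]."""
    return dot(sub(p[1], p[0]), cross(sub(p[2], p[0]), sub(p[3], p[0]))) / 6.0


def max_dihedral_deg(p):
    """Largest of the six dihedral angles, in degrees."""
    normals = []
    for i in range(4):
        a, b, c = (p[j] for j in range(4) if j != i)
        n = cross(sub(b, a), sub(c, a))
        if dot(n, sub(p[i], a)) > 0:  # make the normal point away from vertex i
            n = tuple(-x for x in n)
        normals.append(n)
    worst = 0.0
    for ni, nj in combinations(normals, 2):
        # Dihedral angle = 180 degrees minus the angle between outward normals.
        c = -dot(ni, nj) / math.sqrt(dot(ni, ni) * dot(nj, nj))
        worst = max(worst, math.degrees(math.acos(max(-1.0, min(1.0, c)))))
    return worst


def acceptable(p_now, p_orig, max_angle_deg=170.0, max_volume_ratio=3.0):
    """Apply the angle and volume-ratio criteria to one element."""
    v0, v1 = tet_volume(p_orig), tet_volume(p_now)
    if v0 <= 0.0 or v1 <= 0.0:
        return False
    ratio = max(v1 / v0, v0 / v1)
    return max_dihedral_deg(p_now) <= max_angle_deg and ratio <= max_volume_ratio


ref = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
flat = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0.3, 0.3, 0.001)]
print(acceptable(ref, ref), acceptable(flat, ref))
```

An element failing either test would trigger the local regeneration of the window region.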

If the quality criteria are not satisfied, a local regeneration of the window region is carried out.
Surface reconnection is not allowed during this regeneration process, since that would affect the
connectivity of the non-deforming portion of the grid. The regenerated grid is then merged back
to form a new grid, which is checked to make sure that there are no zero-volume elements.
Deformation is continued after recomputing the weights and windows. When two bodies are
in close proximity, this procedure could result in elements with zero volume on the interface
between the rigid layers and the regenerated grid. This occurs because of the restriction that
no reconnection is allowed on the surface grid. Under such conditions a second level of
regeneration is carried out by remeshing the grid from the body surface to the end of the window
region. If the moving body has a viscous boundary condition, the boundary layer is regenerated,
and the overall grid quality is maintained. In some cases, it may also be possible to eliminate the
zero-volume elements by reconnecting the elements in the fixed region. The results presented in
the next section show the efficiency of the present algorithm, which is mainly due to the fact that
the regenerations are localized.

3.5  Results 

To validate the present dynamic grid generation algorithm, a launch vehicle test case with three
boosters is considered. This configuration consists of bodies in close proximity, which poses a
real challenge for the dynamic grid generation process. Details of the grid generation for this
geometry can be found in Ref. [12]. A viscous grid is generated for the given configuration
using the AFLR technique with mixed element types inside the boundary layer. The initial grid
consists of 390,332 nodes with 628,827 tetrahedra, 4/39 five-node pentahedra (pyramids) and
541,903 six-node pentahedra (prisms). Rigid layer and window region statistics corresponding
to the moving booster are shown in Table 3.1. The CPU time reported is the time to generate the
initial maps for the rigid and window regions.


Table 3.1: Rigid layer and window region statistics

Region          No. of layers   % nodes   % tets   % pyramids   % prisms   CPU (sec)
Rigid layers    17              12.022    0.136    20.20        15.922     2.41
Window region   5               1.876     10.449   0            0          2.05
Window region   10              4.739     22.327   0            0          3.62

The computations are carried out on one processor of a Sun HPC 10000. To bring out
the effect of the number of layers in the window region on the quality of the deformed grid and
the efficiency of the algorithm, a detailed study is carried out by considering both five and ten
layers in the window region. The distribution of the weights in the window region for the initial
grid with five and ten layers is shown in Figures 3.1a and 3.1b, and for the final position in
Figures 3.1c and 3.1d. It can be seen from the figures that the small gap between the main body
and booster restricts the growth of the window region; therefore a local marching procedure is
used for the window identification.

The dynamic grid generation is carried out by allowing one of the boosters to follow a
translational motion. A ΔX of 0.025 is used for the deformation, and 500 deformation steps are
carried out. The overall length of the moving strap-on booster is 47.6. It is observed that,
with the increase in distance between the main body and the booster, there is an increase in the
number of elements and nodes in the window region. After the booster has moved far enough to
establish a fully developed window region, a decrease in the number of elements and nodes in
the window region is noted. The number of elements in the window region ranged from 65646
to 82728 in the five layer case. Local regeneration of the window region is carried out when
the quality criterion is violated. The details of the simulation are presented in Table 3.2. The total
CPU time listed is for all grid work during the simulation, including grid motion, remeshing,
regeneration of layer maps and calculation of weights.


Table 3.2: Remeshing data for strap-on booster separation simulation

Window layers   Steps   Total CPU (sec)   AFLR CPU (sec)   Other CPU (sec)   No. of local (AFLR) remeshings (level 1)
5               500     3268.65           281.58           2987.07           27
10              500     2645.9            230.79           2415.11           7


Other than the AFLR code for mesh generation, the present code is not optimized. With a fully
optimized code, a good improvement in the efficiency of the algorithm is anticipated. Presently,
several of the required maps are generated globally rather than locally for convenience. Field
cuts of the deformed grid with five and ten layers in the window region are shown in Figures 3.2
and 3.3. Figure 3.2 shows that the moving booster leaves a trail near the tip as it moves. With ten
layers in the window region the trail is cleared, and it can be seen from Figure 3.3 that the
deformed grid is of good quality. Figure 3.4 shows the field cut of the final grid with two moving
boosters and five layers in the window region.

The  results  of  the  present  study  show  that  an  increase  in  the  number  of  layers  in  the  window 
region  results  in  a  reduction  in  the  number  of  local  regenerations,  and  a  corresponding  increase 
in  efficiency.  Accuracy  of  the  overall  flow  simulation  algorithm  should  also  increase  due  to 
reduced  interpolation  errors. 
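
The figures reported for the two runs can be put in proportion with a few lines of arithmetic. The numbers below are copied from the remeshing table and from the step size, step count and booster length quoted in the text; the variable names are only for illustration.

```python
# Values transcribed from the remeshing table (CPU times in seconds).
runs = {
    5: {"total": 3268.65, "aflr": 281.58, "other": 2987.07, "remesh": 27},
    10: {"total": 2645.9, "aflr": 230.79, "other": 2415.11, "remesh": 7},
}

# Fraction of the grid-related CPU time spent inside AFLR for each run.
aflr_share = {n: r["aflr"] / r["total"] for n, r in runs.items()}

# Relative reduction in total CPU time when going from five to ten layers.
cpu_saving = 1.0 - runs[10]["total"] / runs[5]["total"]

# Total booster travel over the simulation, and the same relative to its length.
travel = 500 * 0.025
travel_over_length = travel / 47.6

for n in runs:
    print(f"{n} layers: AFLR share of CPU = {aflr_share[n]:.3f}")
print(f"CPU saving with ten layers = {cpu_saving:.3f}")
print(f"travel = {travel:.2f}, travel / booster length = {travel_over_length:.3f}")
```

In both runs AFLR accounts for less than a tenth of the grid-related CPU time, so most of the cost lies in the unoptimized map generation, and going from five to ten window layers cuts the total by roughly a fifth while the booster travels about a quarter of its own length.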


3.6  Conclusions 

A  general  dynamic  grid  algorithm  capable  of  handling  arbitrary  relative  motion  of  multiple 
bodies  is  presented.  A  robust  procedure  for  identifying  the  windows  is  developed  and  the 
treatment  of  bodies  in  close  proximity  is  addressed  by  using  a  local  marching  procedure. 


Figure  3.1:  Window  region  a.  5  layers  initial,  b.  10  layers  initial,  c.  5  layers  final,  d.  10  layers 
final 


Figure  3.2:  Field  cut  after  500  steps,  with  5  layers  in  window  region 


Figure  3.3:  Field  cut  after  500  steps,  with  10  layers  in  window  region 


Figure  3.4:  Field  cut  after  500  steps,  with  10  layers  in  window  region 


CHAPTER  IV 


SIX  DEGREE  OF  FREEDOM  (6DOF)  MODEL 

Prediction of the trajectory of the moving body requires the coupling of the fluid dynamic
equations and the solid body equations. The application of Newton's second law to moving
bodies results in the six-degree-of-freedom (6DOF) equations. Details of the derivation of the
equations can be found in Stevens et al. [23]. Also, a detailed discussion of the equations of rigid
body motion can be found in [24].

Two coordinate systems are used in this model. The body-fixed system is rigidly
attached to the moving body with the center of gravity (CG) as its origin. The position and
orientation of the body are referenced to an inertial coordinate system. $B(t)$ is the
rotation matrix that takes vectors from the inertial to the body coordinate system.

The state variables of the model consist of the three components of the position vector
$p = (x, y, z)^T$, the translational velocity coordinates $v_B = (u, v, w)^T$ and the angular
velocity coordinates $\omega_B = (P, Q, R)^T$. A differential equation is needed for the
time-varying transformation matrix $B$, which leads to four additional equations for the
attitude in terms of the quaternions $q = (q_0, q_1, q_2, q_3)^T$.

One set of state equations is obtained by writing $\dot{p}$ in terms of its components

$$\dot{p} = B^T v_B + \omega_E \times p$$

or

$$\dot{p} = B^T v_B + \Omega_E\, p \tag{4.1}$$

where $\omega_E$ is the absolute angular velocity of Earth's rotation. The symbol $\Omega$ is used
to denote the cross-product matrix corresponding to the operation $(\omega \times)$.

Newton’s  second  law,  applied  to  the  translational  motion,  relates  force  to  rate  of  change  of 
linear  momentum  which  is  given  by 


$$F_B + B m g = m \left.\frac{d}{dt}\right|_i \left[ v_B + B(\omega_E \times p) \right] \tag{4.2}$$

where $F_B$ is the force in the body coordinate system, $Bmg$ is the gravity force rotated into the
body frame by the rotation matrix $B$, and $m \left.\frac{d}{dt}\right|_i (v_B)$ is the time rate of
change of linear momentum of the body with respect to the inertial coordinate system. Expansion of
this last term results in

$$\left.\frac{d}{dt}\right|_i \left[ v_B + B(\omega_E \times p) \right] = (\dot{v}_B + \omega_B \times v_B) + B(\omega_E \times \dot{p}) \tag{4.3}$$

Rearranging Eq. (4.2) using Eqs. (4.1) and (4.3), the state equation for the translational velocity
becomes

$$\dot{v}_B = \frac{F_B}{m} - (\omega_B + B\omega_E) \times v_B + B\left[ g - \omega_E \times (\omega_E \times p) \right] \tag{4.4}$$


The  angular  accelerations  are  obtained  by  applying  Newton’s  second  law  to  the  rate  of  change 
of  angular  momentum  of  the  body.  The  angular  equation  of  motion  is  given  by 


$$T_B = \left.\frac{d}{dt}\right|_i (H_B) \tag{4.5}$$

$T_B$ is the net torque acting about the body CG, $H_B$ is the angular momentum of the rigid body,
and the time derivative is with respect to the inertial coordinate system. Expansion of the time
derivative term results in

$$\left.\frac{d}{dt}\right|_i (H_B) = \dot{H}_B + \omega_B \times H_B \tag{4.6}$$

The angular momentum is given by

$$H_B = J \omega_B \tag{4.7}$$

where $J$ is the matrix of moments of inertia

$$J = \begin{bmatrix} J_{xx} & -J_{xy} & -J_{xz} \\ -J_{xy} & J_{yy} & -J_{yz} \\ -J_{xz} & -J_{yz} & J_{zz} \end{bmatrix}$$

The entries of the inertia matrix are computed as follows. The moment of inertia about the x-axis
is given by

$$J_{xx} = \int (y^2 + z^2)\, dm \tag{4.8}$$

The cross-product of inertia is given by

$$J_{xy} = \int x y\, dm \tag{4.9}$$

The state equation for the rotational motion is given by

$$\dot{\omega}_B = -J^{-1}\left( \omega_B \times (J \omega_B) \right) + J^{-1} T_B \tag{4.10}$$



A four-variable attitude propagation in terms of the quaternions is considered in the present
work to determine the orientation of the body. The three-variable attitude propagation in terms
of Euler angles has some disadvantages. When the pitch angle $\theta$ reaches ±90 degrees, a
division by zero is encountered. Also, Euler angles may integrate up to values outside the normal
range of the pitch, roll and yaw angles [23]. Therefore it is difficult to determine the attitude
uniquely. Four-variable attitude propagation overcomes this problem. The differential equation
for the quaternion parameters is given by

$$\dot{q} = -\frac{1}{2} \Omega_q\, q \tag{4.11}$$

4.1  The  Round-Earth  Equations 

For a complete state model the relevant state equations are assembled in matrix form
(Ref. [23]) as follows

$$\begin{bmatrix} \dot{p} \\ \dot{v}_B \\ \dot{\omega}_B \\ \dot{q} \end{bmatrix} =
\begin{bmatrix}
\Omega_E & B^T & 0 & 0 \\
-B\Omega_E^2 & -(\Omega_B + B\Omega_E B^T) & 0 & 0 \\
0 & 0 & -J^{-1}\Omega_B J & 0 \\
0 & 0 & 0 & -\frac{1}{2}\Omega_q
\end{bmatrix}
\begin{bmatrix} p \\ v_B \\ \omega_B \\ q \end{bmatrix} +
\begin{bmatrix} 0 \\ Bg + \frac{F_B}{m} \\ J^{-1}T_B \\ 0 \end{bmatrix} \tag{4.12}$$

The state vector $X^T = [p^T, v_B^T, \omega_B^T, q^T]$ contains 13 elements and it completely
determines the position of the body at any given instant.

The coefficient matrix consists of submatrices

$$\Omega_q = \begin{bmatrix} 0 & P & Q & R \\ -P & 0 & -R & Q \\ -Q & R & 0 & -P \\ -R & -Q & P & 0 \end{bmatrix}$$

$$B = \begin{bmatrix}
q_0^2 + q_1^2 - q_2^2 - q_3^2 & 2(q_1 q_2 + q_0 q_3) & 2(q_1 q_3 - q_0 q_2) \\
2(q_1 q_2 - q_0 q_3) & q_0^2 - q_1^2 + q_2^2 - q_3^2 & 2(q_2 q_3 + q_0 q_1) \\
2(q_1 q_3 + q_0 q_2) & 2(q_2 q_3 - q_0 q_1) & q_0^2 - q_1^2 - q_2^2 + q_3^2
\end{bmatrix}$$

The rotation matrix $B$ is orthogonal; hence $B^{-1} = B^T$. The forces and moments on the
right-hand side are functions of $v_B$ and $\omega_B$. Equation (4.12) is nonlinear, since the
coefficient submatrices $B$, $\Omega_B$ and $\Omega_q$ are functions of the state variables. The
system of equations is numerically integrated in time using a four-stage Runge-Kutta method.
Earth's rotation rate is not considered in the present work.
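
The attitude part of the model can be exercised on its own with a short sketch: the quaternion rate equation is advanced with a classical four-stage Runge-Kutta step and the rotation matrix is rebuilt from the result. The function names, the constant roll rate and the step size are assumptions chosen for the example; the report's own integrator is not shown in the text.

```python
import math


def rotation_matrix(q):
    """Inertial-to-body rotation matrix B built from the quaternion q."""
    q0, q1, q2, q3 = q
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2*(q1*q2 + q0*q3), 2*(q1*q3 - q0*q2)],
        [2*(q1*q2 - q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2*(q2*q3 + q0*q1)],
        [2*(q1*q3 + q0*q2), 2*(q2*q3 - q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]


def q_dot(q, omega):
    """Quaternion rate: q_dot = -1/2 * Omega_q * q, with omega = (P, Q, R)."""
    P, Q, R = omega
    omega_q = [[0, P, Q, R], [-P, 0, -R, Q], [-Q, R, 0, -P], [-R, -Q, P, 0]]
    return [-0.5 * sum(a * b for a, b in zip(row, q)) for row in omega_q]


def rk4_step(q, omega, dt):
    """One classical four-stage Runge-Kutta step for constant omega."""
    def shifted(k, s):
        return [qi + s * ki for qi, ki in zip(q, k)]
    k1 = q_dot(q, omega)
    k2 = q_dot(shifted(k1, 0.5 * dt), omega)
    k3 = q_dot(shifted(k2, 0.5 * dt), omega)
    k4 = q_dot(shifted(k3, dt), omega)
    return [qi + dt / 6.0 * (a + 2*b + 2*c + d)
            for qi, a, b, c, d in zip(q, k1, k2, k3, k4)]


# Hypothetical case: pure roll at P = 0.5 rad/s for one second.
q = [1.0, 0.0, 0.0, 0.0]
for _ in range(100):
    q = rk4_step(q, (0.5, 0.0, 0.0), 0.01)
B = rotation_matrix(q)
print(q)
print(B[1])
```

For a pure roll the exact quaternion is $(\cos(Pt/2), \sin(Pt/2), 0, 0)$, so the computed values can be compared directly, and $B B^T$ should remain close to the identity as long as the quaternion norm stays close to one.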


4.2  Results 

In order to validate the present code, a test case with no forces is considered, since an exact
solution can be obtained by integrating the state equations. When there are no forces acting on
the body, the rotational equations (Ref. [25]) reduce to

$$A \dot{\omega}_1 = (B - C)\, \omega_2 \omega_3 \tag{4.13}$$

$$B \dot{\omega}_2 = (C - A)\, \omega_3 \omega_1 \tag{4.14}$$

$$C \dot{\omega}_3 = (A - B)\, \omega_1 \omega_2 \tag{4.15}$$

where $A$, $B$, $C$ are the principal moments of inertia.
Figure 4.1: Comparison of the exact and computed solution for $\omega_1$

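Consider the case when two principal moments of inertia are equal, $A = B$; then $\dot{\omega}_3 = 0$,
or $\omega_3 = h$, a constant. The initial conditions at $t = 0$ are $\omega_1 = a$, $\omega_2 = 0$
and $\omega_3 = c = h$. The exact solution (Ref. [25]) is obtained by integrating Eqs. (4.13) and
(4.14) and is given by

$$\omega_1 = a \cos(\lambda t) \tag{4.16}$$

$$\omega_2 = a \sin(\lambda t) \tag{4.17}$$

$$\omega_3 = \text{constant} = h \tag{4.18}$$

where $\lambda = \alpha c$ and, from Eqs. (4.13) and (4.14) with $A = B$, $\alpha = (C - A)/A$.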

800 time steps are used to reach the final time t = 40. It can be seen from Figures 4.1 and 4.2
that there is good agreement between the exact and computed solutions.
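
The validation case can be reproduced in a few lines. The sketch below integrates the torque-free equations with a classical four-stage Runge-Kutta scheme using 800 steps up to t = 40, as in the text. The inertia values, the initial rates and the function names are hypothetical, since the report does not list them, and the frequency $\lambda = (C - A)c/A$ is the value implied by Eqs. (4.13) and (4.14) when $A = B$.

```python
import math


def euler_rhs(w, A, B, C):
    """Right-hand side of the torque-free rotational equations."""
    w1, w2, w3 = w
    return [(B - C) * w2 * w3 / A,
            (C - A) * w3 * w1 / B,
            (A - B) * w1 * w2 / C]


def rk4_step(w, dt, A, B, C):
    """One classical four-stage Runge-Kutta step."""
    def shifted(k, s):
        return [wi + s * ki for wi, ki in zip(w, k)]
    k1 = euler_rhs(w, A, B, C)
    k2 = euler_rhs(shifted(k1, 0.5 * dt), A, B, C)
    k3 = euler_rhs(shifted(k2, 0.5 * dt), A, B, C)
    k4 = euler_rhs(shifted(k3, dt), A, B, C)
    return [wi + dt / 6.0 * (a + 2*b + 2*c + d)
            for wi, a, b, c, d in zip(w, k1, k2, k3, k4)]


def run_validation(A=2.0, C=1.0, a=1.0, c=1.0, t_final=40.0, n_steps=800):
    """Integrate with A = B and return (computed, exact) rates at t_final."""
    B = A
    lam = (C - A) * c / A
    dt = t_final / n_steps
    w = [a, 0.0, c]
    for _ in range(n_steps):
        w = rk4_step(w, dt, A, B, C)
    exact = [a * math.cos(lam * t_final), a * math.sin(lam * t_final), c]
    return w, exact


computed, exact = run_validation()
print("computed:", computed)
print("exact:   ", exact)
print("max error:", max(abs(x - y) for x, y in zip(computed, exact)))
```

With these assumed values the step size is 0.05, small compared with the rotation period, so the fourth-order scheme tracks the exact solution closely over the whole interval.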


CHAPTER  V 


COMPUTATIONAL  METHODOLOGY 

This  chapter  is  intended  to  outline  each  of  the  techniques  used  to  construct  the  present 
Navier-Stokes  unstructured  solution  algorithm. 

5.1  Governing  Equations 

The  solution  algorithm  discussed  in  the  present  work  is  capable  of  handling  both  the  Euler 
and  Navier-Stokes  equations.  Thus,  the  equations  given  here  are  the  full  3D  Navier-Stokes 
equations, with the understanding that the Euler equations do not contain the viscous terms.
The  unsteady  three-dimensional  compressible  Reynolds-averaged  Navier-Stokes  equations  are 
presented  here  in  Cartesian  coordinates  and  in  conservative  form.  The  nondimensionalized 
equations  can  be  written  in  integral  form  as: 

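$$\frac{\partial}{\partial t} \int_{\Omega} Q\, dV + \int_{\partial\Omega} F \cdot \hat{n}\, dA = \frac{M_\infty}{Re} \int_{\partial\Omega} G \cdot \hat{n}\, dA \tag{5.1}$$

$$Q = \begin{bmatrix} \rho \\ \rho u \\ \rho v \\ \rho w \\ e \end{bmatrix}$$

$$F = \begin{bmatrix} \rho (u - V_x) \\ \rho u (u - V_x) + p \\ \rho v (u - V_x) \\ \rho w (u - V_x) \\ e (u - V_x) + p u \end{bmatrix} \hat{i}
+ \begin{bmatrix} \rho (v - V_y) \\ \rho u (v - V_y) \\ \rho v (v - V_y) + p \\ \rho w (v - V_y) \\ e (v - V_y) + p v \end{bmatrix} \hat{j}
+ \begin{bmatrix} \rho (w - V_z) \\ \rho u (w - V_z) \\ \rho v (w - V_z) \\ \rho w (w - V_z) + p \\ e (w - V_z) + p w \end{bmatrix} \hat{k}$$
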
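where the shear stresses are given as

$$\tau_{xx} = (\mu + \mu_t) \left[ 2 \frac{\partial u}{\partial x} - \frac{2}{3} \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} \right) \right] \tag{5.2}$$

$$\tau_{yy} = (\mu + \mu_t) \left[ 2 \frac{\partial v}{\partial y} - \frac{2}{3} \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} \right) \right] \tag{5.3}$$

$$\tau_{zz} = (\mu + \mu_t) \left[ 2 \frac{\partial w}{\partial z} - \frac{2}{3} \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} + \frac{\partial w}{\partial z} \right) \right] \tag{5.4}$$

$$\tau_{xy} = \tau_{yx} = (\mu + \mu_t) \left( \frac{\partial u}{\partial y} + \frac{\partial v}{\partial x} \right) \tag{5.5}$$

$$\tau_{xz} = \tau_{zx} = (\mu + \mu_t) \left( \frac{\partial u}{\partial z} + \frac{\partial w}{\partial x} \right) \tag{5.6}$$

$$\tau_{yz} = \tau_{zy} = (\mu + \mu_t) \left( \frac{\partial v}{\partial z} + \frac{\partial w}{\partial y} \right) \tag{5.7}$$

and the heat flux is defined as

$$\vec{q} = -\frac{1}{\gamma - 1} \left( \frac{\mu}{Pr} + \frac{\mu_t}{Pr_t} \right) \nabla T \tag{5.8}$$

where $T$ is the temperature, $\mu$ and $\mu_t$ are the laminar and turbulent viscosities, and $Pr$
and $Pr_t$ are the laminar and turbulent Prandtl numbers, respectively. The above equations are
closed by the equation of state for an ideal gas, i.e.,

$$e = \frac{p}{\gamma - 1} + \frac{\rho}{2} \left( u^2 + v^2 + w^2 \right) \tag{5.9}$$

where $\gamma$ is the ratio of specific heats and is taken to be 1.4. The above equations were
nondimensionalized with respect to the freestream speed of sound ($a_\infty$), freestream density
($\rho_\infty$), characteristic length scale ($L$), and the freestream viscosity ($\mu_\infty$). Thus,
$Re = \rho_\infty U_\infty L / \mu_\infty$ and $M_\infty$ is the freestream Mach number. The
nondimensional pressure is defined as $p = \tilde{p} / (\rho_\infty a_\infty^2)$,
where $\tilde{p}$ is the local static pressure. Obviously, for laminar flow, $\mu_t = 0$.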
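
The ideal-gas closure can be checked numerically with a small sketch that converts between primitive and conservative variables. It assumes that pressure is scaled by $\rho_\infty a_\infty^2$, so that the nondimensional freestream state has density 1, speed of sound 1 and pressure $1/\gamma$; the function names and the Mach number are illustrative only.

```python
GAMMA = 1.4  # ratio of specific heats, as stated in the text


def total_energy(rho, u, v, w, p):
    """Total energy per unit volume from the ideal-gas equation of state."""
    return p / (GAMMA - 1.0) + 0.5 * rho * (u * u + v * v + w * w)


def pressure(Q):
    """Pressure recovered from the conservative vector Q = (rho, rho*u, rho*v, rho*w, e)."""
    rho, rhou, rhov, rhow, e = Q
    kinetic = 0.5 * (rhou * rhou + rhov * rhov + rhow * rhow) / rho
    return (GAMMA - 1.0) * (e - kinetic)


# Assumed nondimensional freestream: rho = 1, a = 1, hence p = 1 / gamma,
# with the velocity equal to the freestream Mach number along x.
mach = 0.8
rho, u, v, w, p = 1.0, mach, 0.0, 0.0, 1.0 / GAMMA
Q_inf = (rho, rho * u, rho * v, rho * w, total_energy(rho, u, v, w, p))
print(Q_inf)
print(pressure(Q_inf))
```

The round trip from primitive variables to $Q$ and back to pressure is the operation a flow solver performs whenever fluxes are evaluated from the stored conservative state.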

5.2  Spatial  Discretization 

Before  discretization  takes  place,  a  control  volume  must  be  defined.  The  approach  used 
in  the  present  work  is  to  define  a  control  volume  surrounding  each  vertex;  thus,  the  solution 
technique  is  referred  to  as  a  vertex-centered  (or  node-centered)  scheme. 

All solution variables are stored in association with the control volumes. To define the
control volume boundaries specifically, the median dual is used; this dual construction consists
of connecting the centroid of each incident element to the midpoint of each incident edge. The
non-overlapping volumes formed by this procedure are defined to be the control volumes over which
flux balances are performed. The definition of the median dual in two dimensions is shown in
Figure 5.1. Note that in three dimensions, element centroids are connected to face centroids as
well as edge midpoints; this also forms a closed control volume around a vertex.
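
In two dimensions the construction can be written out directly. The sketch below builds, for every triangle, the quadrilateral formed by a vertex, the two adjacent edge midpoints and the element centroid, and accumulates its area at that vertex. The function names and the two-triangle unit square are assumptions made for the example.

```python
def polygon_area(pts):
    """Area of a simple polygon by the shoelace formula."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:] + pts[:1]):
        s += x0 * y1 - x1 * y0
    return 0.5 * abs(s)


def median_dual_areas(points, triangles):
    """Area of the median-dual control volume around each vertex."""
    areas = [0.0] * len(points)
    for tri in triangles:
        p = [points[i] for i in tri]
        centroid = (sum(x for x, _ in p) / 3.0, sum(y for _, y in p) / 3.0)
        for k in range(3):
            v, nxt, prv = p[k], p[(k + 1) % 3], p[(k - 1) % 3]
            mid_next = ((v[0] + nxt[0]) / 2.0, (v[1] + nxt[1]) / 2.0)
            mid_prev = ((v[0] + prv[0]) / 2.0, (v[1] + prv[1]) / 2.0)
            # Piece of this triangle that belongs to vertex tri[k].
            areas[tri[k]] += polygon_area([v, mid_next, centroid, mid_prev])
    return areas


# Hypothetical mesh: the unit square split into two triangles.
pts = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
tris = [(0, 1, 2), (0, 2, 3)]
dual = median_dual_areas(pts, tris)
print(dual, sum(dual))
```

Each triangle contributes one third of its area to each of its vertices, so the dual areas sum to the area of the mesh, which is the non-overlapping property required for the flux balances.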

The  governing  equations  are  discretized  using  a  finite  volume  technique;  thus,  the  surface 
integrals in Equation 5.1 are approximated by a quadrature over the surface of the control volume
of  interest.  So,  the  numerical  discretization  of  the  spatial  terms  associated  with  the  control 
volume  surrounding  vertex  0  results  in 
$$\frac{\partial q_0}{\partial t} + \Re_0 = 0 \tag{5.10}$$

where the spatial residual $\Re$ contains all contributions from the discrete approximation to the
inviscid and viscous terms ($\Re = \Re^{(inv)} + \Re^{(vis)}$). Also, the quantity $q$ is defined
as $q = \int_{\Omega} Q\, dV$.

Figure 5.1: Control volumes are defined as median duals surrounding each vertex.


5.2.1  Inviscid  Terms 

Now, the integral for the inviscid terms must be approximated by a discrete sum; quadrature
points are chosen as the midpoint of each edge incident to the vertex. Doing this, the flux vector
is replaced by a suitable approximation $\Phi$ (termed the "numerical flux vector"), therefore
arriving at

$$\Re_0^{(inv)} = \sum_{i \in N(0)} \Phi_{0i} \cdot \vec{n}_{0i} \tag{5.11}$$

The numerical flux is calculated using the Roe scheme, which solves a one-dimensional
approximate Riemann problem given the two solution states on each side of the control volume
face:

$$\Phi = \frac{1}{2} \left( F(Q_L) + F(Q_R) \right) - \frac{1}{2} \tilde{A}(Q_R, Q_L) \left( Q_R - Q_L \right) \tag{5.12}$$

where $\tilde{A} = \tilde{R} |\tilde{\Lambda}| \tilde{R}^{-1}$. The matrix $\tilde{R}$ is constructed
from the right eigenvectors of the flux Jacobian, and $|\tilde{\Lambda}|$ is a diagonal matrix whose
entries are the absolute values of the eigenvalues of the flux Jacobian. All quantities with the
"~" character are evaluated at an averaged solution state (between $Q_L$ and $Q_R$); for this
compressible flow system, this state is simply the arithmetic average of $Q_L$ and $Q_R$.
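
The structure of this numerical flux is easiest to see for a linear constant-coefficient system $F = AQ$, where the Jacobian does not depend on the state and no averaging is required. The sketch below covers that simplified case only; it is not the report's compressible-flow implementation, and the matrices and function names are hypothetical.

```python
import numpy as np


def abs_matrix(A):
    """|A| = R |Lambda| R^{-1} for a matrix with real eigenvalues."""
    lam, R = np.linalg.eig(A)
    return (R @ np.diag(np.abs(lam)) @ np.linalg.inv(R)).real


def roe_flux_linear(A, QL, QR):
    """Central flux minus the upwind dissipation term, for F(Q) = A Q."""
    return 0.5 * (A @ QL + A @ QR) - 0.5 * abs_matrix(A) @ (QR - QL)


QL = np.array([1.0, 2.0])
QR = np.array([0.5, -1.0])

# All eigenvalues positive: every wave moves to the right, so the flux
# should be taken entirely from the left state.
A_pos = np.array([[2.0, 1.0], [0.0, 3.0]])
print(roe_flux_linear(A_pos, QL, QR), A_pos @ QL)

# Acoustics-like system with eigenvalues +2 and -2: |A| is 2 times identity.
A_ac = np.array([[0.0, 4.0], [1.0, 0.0]])
print(abs_matrix(A_ac))
print(roe_flux_linear(A_ac, QL, QR))
```

When all eigenvalues share one sign the formula reduces to pure upwinding, and for mixed signs the dissipation term blends the two states wave by wave. The nonlinear scheme does the same using the Jacobian evaluated at the averaged state.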
5.2.2 Viscous Terms

The  viscous  terms  can  be  discretized  by  any  of  several  methods;  in  general,  either  a  finite 
element  or  a  finite  volume  technique  is  employed.  Development  of  the  various  techniques  is 
given  below;  for  now,  it  suffices  to  note  that  the  viscous  terms,  since  they  are  linear,  reduce  to 
simple  linear  combinations  of  surrounding  vertices  (assuming  that  a  nearest-neighbor  stencil  is 
maintained): 

$$\Re_0^{(vis)} = -\sum_{i \in N(0)} C_i \left( Q_i - Q_0 \right) \tag{5.13}$$

where $C_i$ is a matrix containing the coefficients that reflect the viscous behavior. The task of
the finite element or finite volume method is to specify these coefficients, which depend on
geometry only.

5.2.2.1 Galerkin Finite Element Method

First,  the  conservation  statement  is  written  in  differential  form: 

$$\frac{\partial Q}{\partial t} + \nabla \cdot F - \underbrace{\frac{M_\infty}{Re} \nabla \cdot G}_{\text{viscous terms}} = 0 \tag{5.14}$$

In this development, only the viscous terms are considered; so, the inviscid term will be
neglected. Now, let the linear operator $\mathcal{V}$ be defined as

$$\mathcal{V}(Q) = \frac{\partial Q}{\partial t} - \frac{M_\infty}{Re} \nabla \cdot G \tag{5.15}$$

The finite element method multiplies the operator equation by a basis function and integrates the
result over the spatial domain. In this case, the basis function $\phi$ is chosen to be a linear
function in each element that is defined as unity at the vertex in question, and zero at each of
the neighbor vertices. Letting $\Gamma$ be the union of all elements that intersect vertex 0,

$$\int_{\Gamma} \phi\, \mathcal{V}(Q)\, d\Omega = 0 \tag{5.16}$$

Note that the integration is zero outside of the domain of influence of $\phi$, because of the definition
of the basis function above. The domain of influence of $\phi$ is simply the union of all elements
intersecting vertex 0. Now, the definition for $\mathcal{D}(Q)$ may be inserted, and simultaneously the
integration is broken into pieces. Note that this operation is done without approximation:

$$\frac{dQ_0}{dt} \int_{\Gamma} \phi\,d\Omega - \frac{1}{Re} \sum_{e \in \Gamma(0)} \int_{\Gamma_e} \phi_e\,\nabla \cdot G_e\,d\Omega = 0 \qquad (5.17)$$

Note that $Q_0$ is a volume averaged dependent variable vector, which is equivalent to using a
lumped mass matrix (the definition is the same as that proposed in the finite volume method).
Now, using the following product rule,

$$\nabla \cdot (\phi f) = \phi\,\nabla \cdot f + f \cdot \nabla \phi \qquad (5.18)$$

a substitution may be made to obtain the following:

$$\frac{dQ_0}{dt} \int_{\Gamma} \phi\,d\Omega - \frac{1}{Re} \sum_{e \in \Gamma(0)} \int_{\Gamma_e} \nabla \cdot (\phi\,G_e)\,d\Omega + \frac{1}{Re} \sum_{e \in \Gamma(0)} \int_{\Gamma_e} G_e \cdot \nabla \phi\,d\Omega = 0 \qquad (5.19)$$

The second term in the above expression can be converted to a surface integral via the divergence
theorem. These boundary terms are identically zero, since the
basis function $\phi$ is zero on the boundary of the integration (the outer boundary of the union
of elements). Using the fact that $\nabla \phi_e$ can be computed exactly if one uses linear elements (a
simple application of Green’s theorem on a given element proves this),

$$\nabla \phi_e = \frac{-1}{n_{dim}\,\Gamma_e}\,\vec{n}_e \qquad (5.20)$$

where $\vec{n}_e$ is an outward pointing normal for the exposed face of element $e$. Equation 5.20 may
be immediately substituted into Equation 5.19 to obtain

$$V_0\,\frac{dQ_0}{dt} - \frac{1}{n_{dim}\,Re} \sum_{e \in \Gamma(0)} G_e \cdot \vec{n}_e = 0 \qquad (5.21)$$

Note that the integral has disappeared: between Equation 5.19 and Equation 5.21, the
terms $G_e$ and $\vec{n}_e$, which are constant within each linear element, are taken outside of the integral. Also, the integral of the basis function
over the domain of integration $\Gamma$ is equal to $\Gamma / (n_{dim} + 1) = V_0$ as long as the control volume is
defined using a median dual.

The viscous flux vector $G_e$ in turn depends on the solution gradients. So, a means must be
provided to evaluate the solution gradient in a particular element, since this is the entity in which
$G_e$ is evaluated. For a simplicial element, Green’s theorem may be utilized:

$$\int_{\Gamma_e} \nabla \psi\,d\Omega = \oint_{\partial \Gamma_e} \psi\,\hat{n}\,dA \qquad (5.22)$$

where $\psi$ is a generic solution variable. Using this expression for evaluating the gradient in an
element, it is possible to complete the approximation for the viscous terms.
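
A minimal two-dimensional sketch of this element gradient is given below. It assumes a linear triangle with counterclockwise vertex ordering and uses the basis-function gradient of Equation 5.20 with $n_{dim} = 2$; the function name `triangle_gradient` and the test data are illustrative assumptions.

```python
import numpy as np

def triangle_gradient(xy, psi):
    """Gradient of a linear field on a triangle (vertices ordered counterclockwise)."""
    x = np.asarray(xy, dtype=float)
    psi = np.asarray(psi, dtype=float)
    d1, d2 = x[1] - x[0], x[2] - x[0]
    area = 0.5 * (d1[0] * d2[1] - d1[1] * d2[0])   # positive for CCW ordering
    grad = np.zeros(2)
    for k in range(3):
        a, b = x[(k + 1) % 3], x[(k + 2) % 3]      # edge opposite vertex k
        edge = b - a
        normal = np.array([edge[1], -edge[0]])     # outward, length-weighted normal
        grad += -psi[k] * normal / (2.0 * area)    # psi_k * grad(phi_k), Equation 5.20
    return grad

# psi = 2x + 3y + 1 sampled at the vertices of a unit right triangle
grad = triangle_gradient([(0, 0), (1, 0), (0, 1)], [1.0, 3.0, 4.0])
```

Because the field is linear within the element, the result is exact and constant over the triangle.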


5.2.2.2  Directional  Derivative  Method

A  primary  disadvantage  of  the  traditional  finite  element  method  is  that  geometric  data  is 
needed  that  is  nonlocal  to  a  particular  edge,  unless  edge  coefficients  are  stored  a  priori  (and  the 
nonlocal  maps  discarded).  However,  the  storage  of  these  edge  coefficients  is  very  expensive, 
since  six  coefficients  per  edge  must  be  stored  in  a  three  dimensional  discretization.  Further,  for 
a  finite  element  method,  different  integration  coefficients  are  needed  depending  on  the  element 
type  in  a  multielement  grid.  For  general  element  grids,  it  is  expedient  to  use  only  edge-local 
information  to  compute  the  viscous  fluxes;  this  allows  the  evaluation  of  the  viscous  fluxes 
associated  with  each  face  of  the  control  volume  without  regard  to  the  varying  element  types  of
the  mesh.  An  algorithm  in  which  no  element  information  is  used  outside  of  metric  computations
is  termed  a  “grid  transparent”  algorithm  [26]. 

To  address  this  difficulty,  one  can  use  a  finite  volume  technique  with  a  direct  approximation 
for  the  gradients  at  the  quadrature  points;  one  such  method  that  uses  this  approach  is  termed  the 
“directional  derivative”  technique  [27],  [28].  The  overall  approach  is  to  combine  data  obtained 
from  the  solution  gradients  at  the  vertices  (which  may  have  already  been  computed  for  dependent 
variable  extrapolation)  and  edge  local  data  to  approximate  the  gradients  required  in  the  viscous 
terms: 

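$$\nabla Q_{ij} = \nabla Q_{ij,norm} + \nabla Q_{ij,tan} \qquad (5.23)$$

Using a directional derivative along the edge to approximate the normal component of the
gradient and the average of the nodal gradients (each edge connects nodes $i$ and $j$) to
approximate the tangential component of the gradient,

$$\left(\nabla Q_{ij} \cdot \hat{s}\right) \hat{s} \approx \frac{Q_j - Q_i}{|\Delta s|}\,\hat{s} \qquad (5.24)$$

$$\left(\nabla Q_{ij} \cdot \hat{t}\right) \hat{t} \approx \overline{\nabla Q} - \left(\overline{\nabla Q} \cdot \hat{s}\right) \hat{s} \qquad (5.25)$$

where $\hat{s}$ is a unit vector in the direction of the edge, $\hat{t}$ is a unit vector in a direction normal to the
edge, $\overline{\nabla Q} = \frac{1}{2}\left(\nabla Q_i + \nabla Q_j\right)$, and $\Delta s = x_j - x_i$. Combining Equation 5.24 and Equation 5.25
leads to the following formula for the edge gradient:

$$\nabla Q_{ij} \approx \overline{\nabla Q} + \left[Q_j - Q_i - \overline{\nabla Q} \cdot \Delta s\right] \frac{\Delta s}{|\Delta s|^2} \qquad (5.26)$$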

Since  the  midpoint  of  the  edge  is  the  quadrature  point  for  the  current  finite  volume  method,  this 
approximation  may  be  used  directly  to  evaluate  the  full  viscous  flux  vector  associated  with  the 
control  volume  face.  Typically,  the  weighted  least  squares  method  is  used  to  evaluate  the  nodal 
gradients  in  the  preceding  formula. 
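
The edge gradient of Equation 5.26 needs only data local to the edge, which a short sketch makes concrete. The function and argument names are hypothetical, and a scalar dependent variable is assumed for brevity.

```python
import numpy as np

def edge_gradient(x_i, x_j, q_i, q_j, grad_i, grad_j):
    """Edge gradient of Equation 5.26 for a scalar variable."""
    ds = x_j - x_i                          # edge vector, Delta s
    grad_avg = 0.5 * (grad_i + grad_j)      # average of the nodal gradients
    # Replace the along-edge part of the average with the directional difference
    correction = (q_j - q_i - grad_avg @ ds) / (ds @ ds)
    return grad_avg + correction * ds

# With zero nodal gradients only the directional derivative along the edge remains
g = edge_gradient(np.array([0.0, 0.0]), np.array([2.0, 0.0]), 0.0, 1.0,
                  np.zeros(2), np.zeros(2))
```

If the nodal gradients are already exact for a linear field, the bracketed correction vanishes and the averaged gradient is returned unchanged.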

Because of the use of vertex gradients, the stencil for this method is no longer a nearest-neighbor
stencil. Since the data structures used for storing the sparse matrix cannot typically
support stencils larger than nearest-neighbor, part of the residual linearization from the
directional derivative method must be neglected on the left hand side. The left hand side terms
can potentially include all contributions from nodes $i$, $j$, and other nodes in the local stencil, but
not contributions from distance-two nodes. In the current approach, all possible contributions in
the nearest-neighbor stencil are taken into account.

5.2.3  Higher  Order  Accuracy 

A  second  order  (in  space)  method  for  the  inviscid  terms  is  constructed  by  extrapolating  the 
solution  at  the  vertices  to  the  faces  of  the  surrounding  control  volume.  Either  Green’s  theorem 
or  a  least  squares  method  is  used  to  compute  the  gradients  at  the  vertices  for  the  extrapolation. 
With  these  gradients  known,  the  variables  at  the  interface  are  computed  as 

$$Q_{ij} = Q_0 + \nabla Q_0 \cdot r \qquad (5.27)$$

where $r$ is defined as $x_f - x_0$; the position $x_f$ is the midpoint of the edge (the quadrature point
for the control volume face). To compute the gradient via Green’s theorem, the following simple
formula is used:

$$\int_{\Omega_0} \nabla Q\,dV = \oint_{\partial \Omega_0} Q\,\hat{n}\,dA \qquad (5.28)$$

Assuming a linear distribution in each element and noting that the area of integration is made up
of discrete pieces,

$$\nabla Q_0 \approx \frac{1}{V_0} \sum_{i \in \mathcal{N}(0)} \frac{1}{2}\left(Q_0 + Q_i\right) \vec{n}_{0i} \qquad (5.29)$$

This  formula  is  equally  applicable  in  two  and  three  dimensions;  of  course,  special  consideration 
must  be  taken  at  the  boundaries  such  that  a  constant  gradient  is  recovered  if  a  linear  distribution 
is  input. 
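
Equations 5.27 and 5.29 can be sketched together for a single interior vertex. The function names, the scalar variable, and the uniform Cartesian test stencil (unit dual-face areas and unit control volume) are assumptions chosen so the result can be checked by hand.

```python
import numpy as np

def green_gauss_gradient(q0, q_nbrs, normals, volume):
    """Vertex gradient of Equation 5.29 from area-weighted dual-face normals."""
    total = np.zeros(len(normals[0]))
    for q_i, n_0i in zip(q_nbrs, normals):
        total += 0.5 * (q0 + q_i) * np.asarray(n_0i, dtype=float)
    return total / volume

def extrapolate_to_face(q0, grad0, x0, x_face):
    """Second order face value of Equation 5.27."""
    return q0 + grad0 @ (np.asarray(x_face, dtype=float) - np.asarray(x0, dtype=float))

# Q = 2x + 3y + 1 on a unit Cartesian stencil centered at the origin
normals = [(1, 0), (-1, 0), (0, 1), (0, -1)]
q_nbrs = [3.0, -1.0, 4.0, -2.0]
grad0 = green_gauss_gradient(1.0, q_nbrs, normals, volume=1.0)
q_face = extrapolate_to_face(1.0, grad0, (0.0, 0.0), (0.5, 0.0))
```

For this closed interior stencil the linear field is reproduced exactly; boundary vertices need the extra care noted above.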

5.3  Temporal  Discretization

After  the  spatial  terms  have  been  suitably  discretized,  the  time  derivative  term  appearing 
in  Equation  5.10  must  be  approximated.  A  general  difference  expression  is  available  for  this 
purpose  ([29],  [30]),  and  is  given  as  follows: 


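$$\Delta q^n = \frac{\theta_1 \Delta t}{1 + \theta_2}\,\frac{\partial}{\partial t}\left(\Delta q^n\right) + \frac{\Delta t}{1 + \theta_2}\,\frac{\partial}{\partial t}\left(q^n\right) + \frac{\theta_2}{1 + \theta_2}\,\Delta q^{n-1} \qquad (5.30)$$

where $\Delta q^n = q^{n+1} - q^n$. A first order accurate in time Euler implicit scheme is given by the
choices $\theta_1 = 1$, $\theta_2 = 0$. Correspondingly, a second order time accurate Euler implicit scheme
is given by $\theta_1 = 1$, $\theta_2 = 1/2$. Since $\theta_1 = 1$ for both time discretizations used in this work,
Equation 5.30 can be further simplified:

$$\Delta q^n = \frac{\Delta t}{1 + \theta_2}\,\frac{\partial}{\partial t}\left(q^{n+1}\right) + \frac{\theta_2}{1 + \theta_2}\,\Delta q^{n-1} \qquad (5.31)$$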


Using  Equation  5.10  to  replace  the  time  derivative, 


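$$\frac{\Delta q^n - \frac{\theta_2}{1 + \theta_2}\,\Delta q^{n-1}}{\Delta t} = -\frac{1}{1 + \theta_2}\,\Re^{n+1} \qquad (5.32)$$

By the definition of $q$, one can write $q = QV$. Then, the following two identities can be formed:

$$\Delta q^n = (QV)^{n+1} - (QV)^n = V^{n+1} \Delta Q^n + Q^n \Delta V^n \qquad (5.33)$$

$$\Delta q^{n-1} = (QV)^n - (QV)^{n-1} = V^{n-1} \Delta Q^{n-1} + Q^n \Delta V^{n-1} \qquad (5.34)$$

Inserting the above two identities into Equation 5.32, one arrives at the following expression:

$$\frac{V^{n+1} \Delta Q^n - \frac{\theta_2}{1 + \theta_2} V^{n-1} \Delta Q^{n-1}}{\Delta t} + Q^n \left[\frac{\Delta V^n - \frac{\theta_2}{1 + \theta_2} \Delta V^{n-1}}{\Delta t}\right] + \frac{1}{1 + \theta_2}\,\Re^{n+1} = 0 \qquad (5.35)$$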


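Now, one must consider the Geometric Conservation Law (GCL). This statement relates the
rate of change of a physical volume to the motion of the volume faces:

$$\frac{\partial V}{\partial t} = \int_{\Omega} \nabla \cdot \vec{V}_s\,dV = \oint_{\partial \Omega} \vec{V}_s \cdot \hat{n}\,dA \qquad (5.36)$$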


According  to  Thomas  and  Lombard  [31]  and  later  Janus  [32],  the  solution  of  the  volume 
conservation  equation  must  be  performed  in  exactly  the  same  manner  as  the  flow  equations 
to  ensure  that  GCL  is  satisfied.  This  procedure  ensures  that  spurious  source  terms  caused  by 
volume  changes  are  eliminated.  Using  the  same  time  differencing  expression  (Equation  5.31)  to 
approximate  Equation  5.36, 


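$$\frac{\Delta V^n - \frac{\theta_2}{1 + \theta_2}\,\Delta V^{n-1}}{\Delta t} = \frac{1}{1 + \theta_2}\,\Re_{GCL} \qquad (5.37)$$

where $\Re_{GCL} = \sum_{i \in \mathcal{N}(0)} \vec{V}_{s,0i}^{\,n+1} \cdot \vec{n}_{0i}^{\,n+1}$. Note that the left hand side of the preceding equation
is exactly the bracketed term in Equation 5.35. Replacing the bracketed term and multiplying
through by $(1 + \theta_2)$ gives the final form of the discretization of the time derivative:

$$\frac{(1 + \theta_2) V^{n+1} \Delta Q^n - \theta_2 V^{n-1} \Delta Q^{n-1}}{\Delta t} + Q^n \Re_{GCL} + \Re^{n+1} = 0 \qquad (5.38)$$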
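
As a small check on how the pieces of Equation 5.38 fit together, its left hand side can be evaluated for a single control volume. This is only a sketch with scalar quantities; the function name `unsteady_residual` and all argument names and numbers are assumptions for illustration.

```python
def unsteady_residual(theta2, dt, vol_np1, vol_nm1, dq_n, dq_nm1, q_n, r_gcl, r_np1):
    """Left hand side of Equation 5.38 for one control volume (scalar sketch)."""
    time_term = ((1.0 + theta2) * vol_np1 * dq_n - theta2 * vol_nm1 * dq_nm1) / dt
    return time_term + q_n * r_gcl + r_np1

# First order in time (theta2 = 0) on a fixed mesh (r_gcl = 0): V dQ/dt + R = 0
first_order = unsteady_residual(0.0, 0.1, 2.0, 2.0, 0.5, 0.0, 1.0, 0.0, -10.0)

# Second order in time (theta2 = 1/2) with a nonzero GCL contribution
second_order = unsteady_residual(0.5, 0.1, 2.0, 2.0, 0.5, 0.2, 1.0, 3.0, -16.0)
```

Both sets of illustrative values were chosen so that the discrete equation is satisfied, that is, the returned value is zero to round-off.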


5.4  Time  Evolution 

The final version of the time discretization of the governing equations (Equation 5.38)
indicates that the spatial residual must be evaluated at time level $n + 1$. Obviously, the solution
state at this time level is unknown; to solve this nonlinear equation, one must linearize about the
known solution $Q^n$. One technique for doing so is to use Newton’s method; following Equation
5.38, let

$$\mathcal{F}^{n+1} = \frac{(1 + \theta_2) V^{n+1} \Delta Q^n - \theta_2 V^{n-1} \Delta Q^{n-1}}{\Delta t} + Q^n \Re_{GCL} + \Re^{n+1} \qquad (5.39)$$

The quantity $\mathcal{F}^{n+1}$ is the function that should be driven to zero by the Newton iteration.
Expanding $\mathcal{F}^{n+1}$ in a Taylor series from a known level $n+1,m$,

Dropping the $\mathcal{O}(\Delta t^2)$ error term, utilizing the chain rule, and replacing $\partial Q / \partial t$ with a first-order
difference,

Since the LHS of the above equation is zero at Newton convergence,

where $\Delta Q^{n+1,m} = Q^{n+1,m+1} - Q^{n+1,m}$. Now, expanding the terms and performing the required
differentiations of $\Re$ results in the following expression for Newton’s method:

(5.43)


where $\Delta Q_0^{n-1} = Q_0^n - Q_0^{n-1}$. For notational convenience, both the inviscid and viscous terms
are collapsed into a single flux function $\mathcal{H}$. Note that the iteration can be started by using an
initial guess of $Q^{n+1,0} = Q^n$. Also, performing only one iteration of Newton’s method per time
step (with first order time discretization and no GCL terms) is equivalent to a time linearization
of the spatial terms only. However, writing the method in this framework is more general than a
straightforward time linearization of the nonlinear terms.
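
The role of the Newton iteration within a time step can be sketched on a scalar model problem. The example assumes a first order implicit step for $dq/dt + R(q) = 0$ on a fixed volume; the function `newton_time_step` and the model residual $R(q) = q^2$ are hypothetical and stand in for the full flow residual and its Jacobian.

```python
def newton_time_step(q_n, dt, residual, d_residual, iterations=5):
    """Drive F(q) = (q - q_n)/dt + R(q) to zero, starting from q^{n+1,0} = q^n."""
    q = q_n
    for _ in range(iterations):
        f = (q - q_n) / dt + residual(q)     # function to be driven to zero
        df_dq = 1.0 / dt + d_residual(q)     # its linearization about the current iterate
        q += -f / df_dq                      # Newton update, delta Q^{n+1,m}
    return q

# Model residual R(q) = q^2 with dR/dq = 2q (illustrative only)
q_new = newton_time_step(1.0, 0.1, lambda q: q * q, lambda q: 2.0 * q)
q_one = newton_time_step(1.0, 0.1, lambda q: q * q, lambda q: 2.0 * q, iterations=1)
```

With `iterations=1` the update reduces to a single linearized implicit step, which corresponds to the time-linearized scheme mentioned above.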


It is clear that Equation 5.43 gives rise to an algebraic sparse matrix system. If the matrix
system is written as $Ax = b$, the left side of the above equation represents $b$, $\Delta Q$ represents $x$,
and the coefficients preceding $\Delta Q$ represent the sparse matrix $A$. Furthermore, the coefficients
leading $\Delta Q_0$ make up the diagonal of $A$, and the coefficients leading $\Delta Q_i$ are the off-diagonal
elements of $A$.

Various methods are available to solve this sparse system of equations; direct methods,
however, are impractical due to an operation count of $O(N b_w^2)$, where $b_w$ is the half-bandwidth
of the matrix. Iterative methods hold more promise in terms of practicality, and can be
loosely divided into matrix splitting relaxation methods and gradient-based techniques. In the
present solution algorithm, two relaxation methods are investigated as techniques to solve the
linear system: the Jacobi method and the symmetric Gauss-Seidel method.


5.4.1  Jacobi  Iteration 

The Jacobi iterative solver splits the matrix into a diagonal, upper triangular, and lower
triangular part:

$$A = \left[\mathcal{L} + \mathcal{D} + \mathcal{U}\right] \qquad (5.44)$$

and defines the iteration as follows:

$$\mathcal{D}\,\Delta Q^{n+1,m+1,k+1} = \Re^{n+1,m} - \left[\mathcal{L} + \mathcal{U}\right] \Delta Q^{n+1,m+1,k} \qquad (5.45)$$

where $\Delta Q^{n+1,m+1,k+1} = Q^{n+1,m+1,k+1} - Q^{n+1,m}$. The advantage of this solution method is
that the matrix multiply $\left[\mathcal{L} + \mathcal{U}\right] \Delta Q^{n+1,m+1,k}$ is very easy to carry out; however, the primary
disadvantage is that the method typically yields very slow convergence. Implementation of this
technique uses a single loop over the edge structures to multiply the off-diagonal terms, followed
by a backsubstitution to solve $\Delta Q^{n+1,m+1,k+1} = \mathcal{D}^{-1} RHS$. A variant of this algorithm is to
use “coloring” such that the convergence of the algorithm is improved. For example, if one colors
all odd-numbered nodes “red” and all even-numbered nodes “black”, the following Red-Black
Jacobi scheme is obtained (sometimes given the misnomer of a Red-Black Gauss-Seidel scheme):

$$\left[\mathcal{D}\right] \Delta Q_R^{n+1,m+1,k+1} = \Re^{n+1,m} - \left[\mathcal{L} + \mathcal{U}\right] \left(\Delta Q_R^{n+1,m+1,k} \cup \Delta Q_B^{n+1,m+1,k}\right) \qquad (5.46)$$
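
$$\left[\mathcal{D}\right] \Delta Q_B^{n+1,m+1,k+1} = \Re^{n+1,m} - \left[\mathcal{L} + \mathcal{U}\right] \left(\Delta Q_R^{n+1,m+1,k+1} \cup \Delta Q_B^{n+1,m+1,k}\right) \qquad (5.47)$$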


where  the  subscripts  R  and  B  denote  the  portions  of  the  reference  vector  which  belong  to  the 
red  and  black  sets,  respectively.  As  the  number  of  colors  approaches  the  number  of  nodes,  the 
colored  Jacobi  scheme  approaches  the  unidirectional  Gauss-Seidel  scheme. 
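
Both variants can be sketched on a small dense system. This is an illustration only: the scalar diagonal stands in for the block diagonal of the flow solver, a dense matrix replaces the edge-based sparse storage, and the names `jacobi`, `red_black_jacobi`, and the test matrix are assumptions.

```python
import numpy as np

def jacobi(a, b, sweeps):
    """Plain Jacobi: D x_new = b - (L + U) x_old."""
    diag = np.diag(a)
    off = a - np.diag(diag)
    x = np.zeros_like(b)
    for _ in range(sweeps):
        x = (b - off @ x) / diag
    return x

def red_black_jacobi(a, b, sweeps):
    """Two-color Jacobi: red nodes are updated first, black nodes use the new red values."""
    diag = np.diag(a)
    off = a - np.diag(diag)
    x = np.zeros_like(b)
    red = np.arange(len(b)) % 2 == 0      # odd-numbered nodes in 1-based numbering
    black = ~red
    for _ in range(sweeps):
        x[red] = ((b - off @ x) / diag)[red]
        x[black] = ((b - off @ x) / diag)[black]
    return x

a = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x_jacobi = jacobi(a, b, 60)
x_red_black = red_black_jacobi(a, b, 60)
```

The only difference between the two functions is that the black update sees the freshly computed red values, which is the source of the improved convergence.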

5.4.2  Symmetric  Gauss-Seidel  Iteration 

The symmetric Gauss-Seidel (SGS) matrix solution method begins by splitting the matrix
into lower triangular, diagonal, and upper triangular parts:

$$A = \left[\mathcal{L} + \mathcal{D} + \mathcal{U}\right] \qquad (5.48)$$

where the diagonal, upper, and lower operators are defined by

Then, the symmetric Gauss-Seidel method can be written as the following two-step process per
iteration:

$$\left[\mathcal{L} + \mathcal{D}\right] \Delta Q^{n+1,m+1,k+\frac{1}{2}} + \left[\mathcal{U}\right] \Delta Q^{n+1,m+1,k} = \Re^{n+1,m} \qquad (5.52)$$

$$\left[\mathcal{D} + \mathcal{U}\right] \Delta Q^{n+1,m+1,k+1} + \left[\mathcal{L}\right] \Delta Q^{n+1,m+1,k+\frac{1}{2}} = \Re^{n+1,m} \qquad (5.53)$$

Typically, the initial guess for $\Delta Q^{n+1,m+1,0}$ is zero. The implementation of this algorithm is
particularly simple; for the first pass, sweep forward through the vertices. For every vertex,
multiply the off-diagonal terms by the most recent solution stored in a buffer $\Delta Q$, subtract them
from the corresponding element in $\Re^{n+1,m}$, solve the system $\mathcal{D} x = RHS$ (a 5x5 system for 3D compressible
flows), and copy the solution back into the $\Delta Q$ buffer. For the second sweep,
perform exactly the same operations, but loop backward through the vertices instead of
forward. Note that the Gauss-Seidel algorithm requires a vertex to surrounding edge map, since
the sparse matrix-vector multiplies must be undertaken on a row-by-row basis.
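
The forward and backward sweeps described above can be sketched as follows. The sketch assumes a dense matrix with scalar diagonal entries in place of the 5x5 diagonal blocks, and the function name and test system are hypothetical.

```python
import numpy as np

def symmetric_gauss_seidel(a, b, sweeps):
    """One forward and one backward Gauss-Seidel sweep per iteration."""
    n = len(b)
    x = np.zeros_like(b)                     # initial guess of zero
    for _ in range(sweeps):
        for order in (range(n), reversed(range(n))):
            for i in order:
                # Off-diagonal row product with the most recent solution buffer
                off_diag = a[i] @ x - a[i, i] * x[i]
                x[i] = (b[i] - off_diag) / a[i, i]
    return x

a = np.array([[4.0, -1.0, 0.0], [-1.0, 4.0, -1.0], [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
x_sgs = symmetric_gauss_seidel(a, b, 30)
```

Because each row is processed in turn and immediately overwrites the buffer, the loop needs row-by-row access to the matrix, which is why a vertex to surrounding edge map is required in the unstructured setting.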

5.5  Turbulence  Modeling 

A  model  for  the  effects  of  turbulence  is  a  necessary  component  for  simulating  high  Reynolds 
number  flows.  In  the  present  work,  the  turbulence  model  is  incorporated  in  a  “loosely-coupled” 
procedure;  that  is,  the  mean  flow  equations  are  solved  first,  and  then  the  turbulence  model  is 
solved  independently.  Coupling  between  the  two  is  accomplished  since  the  turbulence  model 
uses the most recently computed solution ($Q^{n+1}$), and the solution of the core governing
equations uses the most recently computed eddy viscosity ($\mu_t^{n}$). Figure 5.2 outlines this
procedure. 


5.5.1  Spalart-Allmaras  Turbulence  Model 

The  one-equation  turbulence  model  of  Spalart  and  Allmaras  is  available  for  high  Reynolds 
number  flows  [33];  this  model  formulates  a  transport  equation  for  the  turbulent  Reynolds 
number,  which  is  then  related  to  the  turbulent  viscosity.  From  the  original  Spalart  and  Allmaras 
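paper [33], a transport equation can be written for a working variable $\tilde{\nu}$ (the turbulent Reynolds
number):

$$\frac{D\tilde{\nu}}{Dt} = c_{b1}\left[1 - f_{t2}\right] \tilde{S}\,\tilde{\nu} + \frac{1}{\sigma}\left[\nabla \cdot \left((\nu + \tilde{\nu}) \nabla \tilde{\nu}\right) + c_{b2} \left(\nabla \tilde{\nu}\right)^2\right] - \underbrace{\left[c_{w1} f_w - \frac{c_{b1}}{\kappa^2} f_{t2}\right] \left[\frac{\tilde{\nu}}{d}\right]^2}_{\text{destruction}} + \underbrace{f_{t1}\,\Delta U^2}_{\text{trip}} \qquad (5.54)$$

Figure 5.2: The solution procedure that incorporates the turbulence model is decoupled from the
original system of equations.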

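For evaluation of the diffusive term, it is useful to slightly rearrange Equation 5.54 into an
equivalent form. Also, in this work, the trip term is neglected. After nondimensionalization,
the equation for $\tilde{\nu}$ becomes

$$\cdots - \left[c_{w1} f_w - \frac{c_{b1}}{\kappa^2} f_{t2}\right] \left[\frac{\tilde{\nu}}{d}\right]^2 + \cdots \left\{\nabla \cdot \left[\left(\nu + (1 + c_{b2})\,\tilde{\nu}\right) \nabla \tilde{\nu}\right] - c_{b2}\,\tilde{\nu}\,\nabla \cdot \left[\nabla \tilde{\nu}\right]\right\} \qquad (5.55)$$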

where  the  function  definitions  in  nondimensionalized  form  are 

(5.58)

(5.59) 

(5.60) 

(5.61) 

(5.62) 

(5.63) 

(5.64) 

(5.65) 

(5.66) 

(5.67) 

(5.68) 


The  constant  definitions  are  as  follows: 


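$\kappa = 0.41$, $\sigma = 2/3$, $c_{v1} = 7.1$,

$c_{b1} = 0.1355$, $c_{b2} = 0.622$,

$c_{w1} = \frac{c_{b1}}{\kappa^2} + \frac{1 + c_{b2}}{\sigma}$, $c_{w2} = 0.3$, $c_{w3} = 2.0$,

$c_{t1} = 1.0$, $c_{t2} = 2.0$, $c_{t3} = 1.1$, $c_{t4} = 2.0$,

$c_{r1} = 1$, $c_{r2} = 12$, $c_{r3} = 1$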
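
The derived constant $c_{w1}$ follows directly from the values listed above, as the short calculation below shows. The function `f_v1`, which relates the working variable to the eddy viscosity through $\nu_t = \tilde{\nu} f_{v1}$ with $\chi = \tilde{\nu}/\nu$, is taken from the standard statement of the model in [33] and is included here as an assumption about the form used in this work.

```python
KAPPA = 0.41
SIGMA = 2.0 / 3.0
C_B1 = 0.1355
C_B2 = 0.622
C_V1 = 7.1

# c_w1 is not independent; it is fixed by the other constants
C_W1 = C_B1 / KAPPA**2 + (1.0 + C_B2) / SIGMA

def f_v1(chi):
    """Viscous damping function of the standard model, chi = nu_tilde / nu."""
    return chi**3 / (chi**3 + C_V1**3)
```

Evaluating the expression gives $c_{w1} \approx 3.239$; the damping function tends to zero at the wall and to unity far from it.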

Note that Equation 5.66 gives two options for computing $f_{r1}$. The first option is the default
value from the original formulation of the model [33]. The second option is a modification to the
production term suggested in [34] to better preserve vortices in the near field. Unless otherwise
stated, simulations are performed in the present work using the default option.

Equation  5.55  must  be  appropriately  discretized  for  implementation  into  the  unstructured 
solution  procedure.  A  Galerkin  finite-element  method  or  a  directional  derivative  method  (Section 
5.2.2) is used to discretize the diffusive terms that are inside of the divergence operator; the $\tilde{\nu}$
term outside of the divergence is assumed to be constant within the control volume. A pure
upwind method is used to discretize the convective terms. The turbulence production and
destruction terms are evaluated with the assumption that $\tilde{\nu}$ is constant within the given control
volume. Each of the terms is appropriately linearized with respect to time (with attention paid
to positivity considerations) to derive the sparse matrix required for implicit solution of the
governing equation.

On no-slip surfaces, the turbulent Reynolds number ($\tilde{\nu}$) is defined to be zero and therefore
is not solved for. At farfield inflow boundaries, $\tilde{\nu}$ is set to a freestream value of 1/10 (as per
the recommendation of [33]) for the boundary face flux evaluation. The dependent variable
is  extrapolated  from  the  interior  for  the  corresponding  flux  evaluation  on  farfield  outflow 
boundaries. 


5.6  Parallel  Methodology 

The  present  investigation  explores  several  alternative  methods  of  treating  the  domain 
decomposition,  subdomain  interface  connectivity,  and  subdomain  coupling  in  the  parallel 
unstructured  solution  algorithm.  In  addition,  proper  embedding  of  the  parallelization  within  the 
iteration  hierarchy  is  discussed.  The  discussions  concerning  the  iteration  hierarchy,  subdomain 
coupling,  and  interface  connectivity  apply  primarily  to  the  mean  flow,  but  are  also  equally 
applicable  to  the  solution  of  the  turbulence  model. 

5.6.1  Iteration  Hierarchy

The  notion  of  an  iteration  hierarchy  was  introduced  in  [35].  The  purpose  of  the  levels  of 
the  iteration  hierarchy  is  to  reduce  error  components  that  arise  at  various  stages  of  the  solution 
process.  The  Newton  iteration  reduces  errors  arising  from  the  time  linearization  of  the  nonlinear 
terms,  while  the  inner  subiterations  reduce  error  that  is  caused  by  the  splitting  of  the  linear
system.  In  addition,  these  linear  subiterations  coupled  with  subdomain  communication  can  be 
used  to  eliminate  errors  that  occur  due  to  the  loss  of  coupling  caused  by  the  partitioning  of  the 
domain  into  subdomains. 

The  sequential  iteration  hierarchy  utilized  in  [36]  greatly  reduces  memory  requirements  by 
sequential  solution  within  each  subdomain.  The  procedure  allows  reuse  of  storage  for  Jacobians 
and  other  memory  intensive  operations,  and  allows  for  one  update  of  all  communicated  quantities 
at  the  beginning  of  each  time  step. 

Since  parallel  concurrency  allows  one  to  update  at  any  desired  point  in  the  iteration  hierarchy, 
it  is  possible  to  relax  the  restrictions  of  the  hierarchy  given  in  [36]  to  formulate  a  more  flexible 
updating  strategy.  Reorganization  of  the  updating  structure  is  intended  to  address  shortcomings 
of  the  sequential  iteration  hierarchy: 

•  Since gradients are updated between time steps, gradients on the interface are effectively
lagged. Thus, the residual is computed using $n$ time level gradients on the interior, but
$n - 1$ time level gradients on the block interfaces. This leads to an inconsistency in the
residual computations at the subdomain interfaces.

•  Since the solution $Q$ is updated at the beginning of the time step, it is impossible to carry
out Newton iterations, if desired.

•  No $\Delta Q$ information is passed during sparse matrix solves, which forces a non-implicit
handling of the subdomain interfaces. As the number of subdomains increases, the sparse
matrix solver suffers from convergence degradation.

The  proposed  iteration  hierarchy  for  the  present  work  is  shown  in  Figure  5.3,  where  two 
updating  modes  are  defined:  concurrent  subdomain  iteration  and  sequential  subdomain  iteration. 
The  sequential  mode  of  subdomain  iteration  is  the  same  as  that  presented  in  [36].  In  concurrent 
mode,  updating  is  carried  out  such  that  interface  quantities  are  updated  at  the  same  time  as  they
would  be  in  a  single  subdomain  algorithm.  This  provides  the  correct  time  and/or  iterative  level
for  each  of  the  quantities  that  is  exchanged  via  communication. 

Two items should be addressed by the iteration hierarchy in a parallel context: 1) consistency
of the residual and 2) alleviation of degradation in convergence.

Residual  consistency  ensures  that  the  residual  computed  for  every  node  in  the  domain  is 
the same regardless of the number of subdomains. Since the residual is a function of $x$, $Q$,
and $\nabla Q$, it is sufficient to make copies of these data available on the subdomain interfaces
such that the computed residual on each side of these interfaces is the same. Alternatively, one
subdomain can be responsible for computing the residual for an interface node, and this residual
is then communicated to the other. The concurrent subdomain iteration relies on the former
technique, which is to distribute each quantity that is used in the residual calculation ($x$, $Q$, $\nabla Q$)
and compute the interface fluxes redundantly in each subdomain. Because the present solution
algorithm  uses  coarse-grained  parallelism  (the  computational  volume  is  large  compared  to  the 
communication  surface),  the  cost  of  this  redundancy  is  of  small  consequence. 

Secondly,  the  hierarchy  should  allow  for  the  communication  necessary  for  the  parallel 
algorithm  to  display  convergence  characteristics  reasonably  close  to  its  serial  counterpart.  In 
concurrent subdomain iteration, this is accomplished by allowing communication of $\Delta Q$ during
the subiterations so that the parallel algorithm is able to access the entire vector corresponding
to the complete domain rather than just the vector belonging to a particular subdomain. In this
way,  sparse  matrix-vector  multiplies  can  be  carried  out  in  a  parallel  context  that  produce  the 
same  results  as  in  a  sequential  context. 

Note  that  the  sequential  updating  mode  given  in  Figure  5.3  does  not  allow  for  residual 
consistency  or  alleviation  of  convergence  degradation.  Although  this  characteristic  is  not 
desirable,  this  variation  of  the  hierarchy  is  present  to  support  the  memory  leveraging  procedure 
published  in  [36].  Using  this  variation,  solutions  may  be  performed  using  (typically)  1/5  of  the 
memory  consumed  by  the  serial  algorithm,  since  one  is  able  to  cycle  each  block  in  sequence 
and  reuse  memory  abandoned  by  the  last  subdomain  in  the  cycle.  The  sequential  updating  mode 
also  allows  one  to  map  more  than  one  subdomain  onto  a  particular  processor.  This  capability
is  available  at  the  cost  of  some  convergence  degradation  and  can  be  used  for  steady  flows  only, 
where  the  residual  inconsistency  on  the  interfaces  is  inconsequential  at  algorithm  convergence. 

5.6.2  Subdomain  Interface  Treatments 

Two  types  of  subdomain  interfaces  are  considered.  The  first  is  a  surface  interface  in  which
the  decomposition  is  along  a  surface  made  up  of  element  faces,  as  shown  in  Figure  5.4;  this  is 
the  connectivity  technique  employed  by  Sheng  and  Whitfield  [36].  The  second  is  a  mesh  vertex 
decomposition  in  which  distinct  node-based  control  volumes  are  assigned  to  subdomains  as  in 
Figure  5.5;  the  interface  thus  lies  on  the  dual  to  the  mesh.  Each  technique  entails  a  different 
storage  and  communication  paradigm. 

A  clarification  should  be  issued  here  in  regard  to  the  terminology  used  in  this  work;  whereas 
“element-based”  and  “vertex-based”  decompositions  indicate  directly  the  entity  that  is  separated 
between  the  subdomains,  a  “control-volume-based”  decomposition  is  slightly  ambiguous.  In  a 
node-based  solution  scheme  (as  in  the  current  work,  control  volumes  are  constructed  around  each 
node),  a  control  volume  decomposition  is  equivalent  to  a  nodal  decomposition,  since  a  vertex  and 
a  control  volume  have  a  one-to-one  correspondence.  However,  in  a  cell-based  solution  scheme, 
a  control  volume  decomposition  corresponds  to  an  element-based  connectivity  scheme,  since 
each  element  corresponds  to  one  control  volume  (as  in  [37]).  The  two  types  of  decompositions 
discussed  here,  the  element-based  and  node-based  decomposition,  are  presented  in  the  context 
of  a  nodal  control  volume  solution  technique. 

5.6.3  Element-based  Connectivity  Scheme 

To  establish  an  element-based  subdomain  connectivity,  “phantom”  entities  are  created  for 
every  primitive  in  the  mesh.  This  involves  creating  lists  of  phantom  nodes,  elements,  boundary 
facets,  and  edges.  It  is  important  to  note  that  since  elements  are  assigned  uniquely  to  each 
partition,  nodes  on  a  subdomain  interface  are  duplicated  in  at  least  two  blocks;  thus,  the 
ownership status of these nodes is ambiguous. A diagram of the element-based connectivity
scheme is shown in Figure 5.4. Note that global node numberings are listed first, and subdomain
local numberings are listed second.

Figure 5.4: Schematic of the element-based connectivity scheme

In  a  given  subdomain,  phantom  nodes  are  created  for  every  vertex  on  the  subdomain  interface 
as  well  as  for  every  node  connected  to  the  interface  in  the  bordering  subdomain.  To  maintain 
connectivity  for  these  newly  created  vertices  it  is  necessary  to  construct  phantom  elements, 
phantom  boundary  facets,  and  phantom  edges.  Using  these  newly  created  entities,  one  can  treat 
the  connectivity  structure  as  a  layer  of  interior  entities  that  can  distribute  or  accumulate  computed 
quantities  as  needed.  Note  that  this  treatment  leads  to  a  distance-two  overlap  of  the  subdomains. 

To  compute  residual  and  Jacobian  contributions  from  adjacent  subdomains,  loops  are 
constructed  over  phantom  entities  which  perform  exactly  the  same  operations  as  the 
corresponding  loops  over  the  physical  mesh  entities.  Therefore,  each  loop  in  the  solver  is 
followed  by  a  nearly  identical  loop  over  the  associated  phantom  primitive.  In  these  secondary 
loops,  tests  are  performed  on  the  nodes  owned  by  the  phantom  entities  such  that  data  scatters  are 
undertaken appropriately. A pseudocode example (for computing the inviscid residual) is below:

do i = 1, nedge
   n1 = node 1 of edge i
   n2 = node 2 of edge i
   gather geometry and solution information for i, n1, n2
   compute flux along edge i
   add flux to residual for n1
   subtract flux from residual for n2
enddo

do i = 1, neph
   nph1 = phantom node 1 of phantom edge i
   nph2 = phantom node 2 of phantom edge i
   n1 = node associated with nph1
   n2 = node associated with nph2
   gather geometry and solution information for i, nph1, nph2
   compute flux along phantom edge i
   if n1 exists, add flux to residual for n1
   if n2 exists, subtract flux from residual for n2
enddo
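
The two loops above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the solver's code: the unknown is a scalar, `edge_flux` is a hypothetical stand-in for the inviscid flux, and `ph_to_local` is an assumed lookup that maps a phantom node to its local counterpart (or `None` when no counterpart exists in this subdomain).

```python
def edge_flux(qa, qb):
    """Hypothetical stand-in for the inviscid flux along an edge."""
    return 0.5 * (qa + qb)


def residual_element_based(q, edges, q_ph, ph_edges, ph_to_local):
    """Accumulate edge fluxes from physical edges, then from phantom edges.

    q           : solution at the local (physical) nodes
    edges       : (n1, n2) pairs of local node indices
    q_ph        : solution at the phantom nodes (filled by communication)
    ph_edges    : (nph1, nph2) pairs of phantom node indices
    ph_to_local : phantom index -> local node index, or None
    """
    res = [0.0] * len(q)
    # Loop over the physical edges.
    for n1, n2 in edges:
        f = edge_flux(q[n1], q[n2])
        res[n1] += f
        res[n2] -= f
    # Nearly identical loop over the phantom edges, with ownership tests.
    for nph1, nph2 in ph_edges:
        f = edge_flux(q_ph[nph1], q_ph[nph2])
        n1 = ph_to_local[nph1]
        n2 = ph_to_local[nph2]
        if n1 is not None:
            res[n1] += f
        if n2 is not None:
            res[n2] -= f
    return res
```

For a subdomain with two local nodes whose second node lies on the interface, `residual_element_based([1.0, 2.0], [(0, 1)], [2.0, 4.0], [(0, 1)], [1, None])` completes the flux balance at the interface node using the phantom edge.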


5.6.4  Node-based  Connectivity  Scheme 

An  alternative  method  to  handle  the  subdomain  interfaces  is  to  uniquely  assign  each  vertex 
to  a  particular  subdomain;  this  treatment  is  termed  a  node-based  interface  scheme.  A  diagram  of 
this  interface  connectivity  scheme  is  shown  in  Figure  5.5.  Note  that  global  node  numberings  are 
listed  first,  and  subdomain  local  node  numberings  are  listed  second.  Using  this  treatment,  only 
phantom  nodes  must  be  created  to  fully  define  the  connectivity  between  each  subdomain. 

Using  this  control-volume  based  connectivity  scheme,  each  subdomain  appears  to  the  solver 
as  a  complete  domain,  except  that  phantom  nodes  may  exist.  Normal  entities  in  the  grid,  such 
as  elements  and  edges,  may  contain  one  or  more  phantom  nodes.  The  only  special  treatment 
given  to  these  nodes  is  that  accumulations  (if  they  are  performed)  are  ignored,  and  these  vertices 
are  the  points  at  which  incoming  parallel  communications  take  place.  Outgoing  communications 
take  place  from  any  nodes  connected  to  a  phantom  node.  Defining  the  subdomain  connectivity  in 
this  way  leads  to  a  distance-one  overlap  between  subdomains.  To  compute  residual  and  Jacobian 
contributions across subdomains, no modification to the core computation is required (in contrast
to the element-based connectivity scheme).

Figure 5.5: Schematic of the node-based connectivity scheme


To compute residual and Jacobian contributions across subdomains, only a slight modification
to the original code has been made in this work. The modification consists of inserting a
check before any accumulation to a node takes place, to ensure that the node accumulated to is
not a phantom node. This check is not strictly necessary (since accumulations to a phantom
node will be ignored), but it is implemented for clarity as well as efficiency. The
implementation is demonstrated by the following pseudocode:

do i = 1, nedge
   n1 = node 1 of edge i
   n2 = node 2 of edge i
   gather geometry and solution information for i, n1, n2
   compute flux along edge i
   if n1 is not phantom, add flux to residual for n1
   if n2 is not phantom, subtract flux from residual for n2
enddo
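
A minimal Python rendering of this loop is given here for illustration. It assumes a scalar unknown, a hypothetical `edge_flux` function in place of the real inviscid flux, and a per-node boolean list `is_phantom`; none of these names come from the original code.

```python
def edge_flux(qa, qb):
    """Hypothetical stand-in for the inviscid flux along an edge."""
    return 0.5 * (qa + qb)


def residual_node_based(q, edges, is_phantom):
    """Single edge loop; accumulations to phantom nodes are skipped.

    q          : solution at all local nodes, phantom nodes included
    edges      : (n1, n2) pairs of local node indices
    is_phantom : is_phantom[n] is True when node n is a phantom node
    """
    res = [0.0] * len(q)
    for n1, n2 in edges:
        f = edge_flux(q[n1], q[n2])
        if not is_phantom[n1]:
            res[n1] += f
        if not is_phantom[n2]:
            res[n2] -= f
    return res
```

Note that the phantom node still supplies its state to the flux evaluation; only the accumulation to it is suppressed.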

Algorithmic,  memory,  and  execution  time  issues  are  associated  with  the  two  connectivity 
scheme  types,  and  it  has  been  found  that  the  node-based  connectivity  scheme  has  several 
advantages:

• improved convergence properties

• problem size does not (artificially) increase as the number of partitions is increased; hence,
the algorithm can now be scalable

• less computation per time step

• eliminates solution ambiguity, since a specific subdomain owns each control volume and
hence its associated volume-averaged solution variables

•  amount  of  connectivity  information  decreases;  thus,  memory  is  conserved 

•  handles  general  element  grids  and  other  definitions  of  nodal  control  volumes  without 
modification 

For  edge-based  computations  (primarily  the  residual  and  Jacobian)  special  consideration 
must  be  given  to  the  edges  that  span  interpartition  boundaries  (for  ease  of  exposition,  the  residual 
calculations  are  considered  here).  The  difficulty  arises  since  flux  integrals  on  nodes  adjacent 
to  partition  boundaries  cannot  be  completed  without  some  sort  of  communication.  For  these 
edges,  flux  computations  may  be  1)  calculated  in  a  preassigned  subdomain  and  subsequently 
communicated  to  the  neighboring  subdomain,  or  2)  calculated  redundantly  in  each  subdomain. 
Thus, the optimal choice depends on whether communicating the interface fluxes is
faster than computing them. In this work, the second option is chosen for two reasons: 1)
in most situations, computing a flux for an edge consumes less overall time than communicating
the result of a flux evaluation, and 2) message passing for edges would entail the building and
maintenance of a secondary (expensive) subdomain connectivity structure. It should be noted
that  even  if  the  flux  values  are  communicated  instead  of  redundantly  computed,  a  preliminary 
communication  to/from  phantom  nodes  is  still  necessary  to  provide  the  state  vectors  for  the 
Riemann  problem.  In  a  field  solver  where  the  residual  (and  Jacobian)  calculations  are  extremely 
costly  to  compute  (such  as  in  chemically  reacting  flows),  the  most  efficient  choice  could  be  to 
communicate  flux  evaluations  on  interpartition  edges  rather  than  redundantly  compute  them. 
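
The redundant-computation option can be checked on a small example. The sketch below is an assumption-laden illustration rather than the solver's implementation: `owner[n]` gives the subdomain that owns node `n`, reading `q` at a non-owned node stands in for a phantom-node state that has already been communicated, and `edge_flux` is a hypothetical flux function.

```python
def edge_flux(qa, qb):
    """Hypothetical stand-in for the inviscid flux along an edge."""
    return 0.5 * (qa + qb)


def serial_residual(q, edges):
    """Reference residual computed on the undivided domain."""
    res = [0.0] * len(q)
    for a, b in edges:
        f = edge_flux(q[a], q[b])
        res[a] += f
        res[b] -= f
    return res


def partitioned_residual(q, edges, owner, nparts):
    """Each subdomain evaluates every edge that touches a node it owns."""
    res = [0.0] * len(q)
    flux_evals = 0
    for p in range(nparts):
        for a, b in edges:
            if owner[a] != p and owner[b] != p:
                continue  # edge does not appear in subdomain p
            f = edge_flux(q[a], q[b])  # cut edges are evaluated twice
            flux_evals += 1
            if owner[a] == p:
                res[a] += f
            if owner[b] == p:
                res[b] -= f
    return res, flux_evals
```

On a four-node chain split into two subdomains, the assembled residual matches the serial one, and the only extra work is one additional flux evaluation for the single cut edge.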

5.6.5  Subdomain  Iteration  Methods 

A  primary  advantage  of  implicit  methods  is  the  global  communication  of  data  that  occurs 
each time step. However, the division of a domain into subdomains implies that, without
specific procedures to ensure subdomain coupling, this global communication degenerates into
propagation  of  waves  only  within  the  individual  subdomains.  Obviously,  the  loss  of  global  wave 
propagation  leads  to  a  deterioration  in  convergence  characteristics  of  the  parallel  algorithm  when 
compared  to  the  original  serial  algorithm. 

One can loosely view the solution of the sparse linear system (arising from an implicit
approximation) as a sequence of sparse matrix-vector products. Each task owns a set of rows of
the  sparse  matrix  and  the  corresponding  section  of  a  vector  to  be  multiplied.  However,  a  given 
task  does  not  necessarily  own  the  entire  section  of  the  vector  that  must  be  multiplied  by  a  given 
row  of  the  sparse  matrix;  so,  one  must  provide  a  mechanism  to  pass  these  nonlocal  entries  of 
the  vector  from  the  owning  task  to  the  task  that  must  use  the  quantity  in  the  multiply  operation. 
Storage  locations  are  set  aside  for  this  procedure  as  described  in  Section  5.6.2. 
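
The row-wise ownership described above can be illustrated with a small sketch. It is not the solver's data structure: rows are stored as `{column: value}` dictionaries, each task holds its owned vector entries in a dictionary keyed by global index, and the inner lookup over all tasks is an assumed stand-in for message passing.

```python
def needed_ghosts(rows, x_owned):
    """Column indices used by the local rows but owned by another task."""
    return sorted({j for row in rows.values() for j in row if j not in x_owned})


def local_matvec(rows, x_owned, x_ghost):
    """Multiply the locally owned rows, reading nonlocal entries from x_ghost."""
    y = {}
    for i, row in rows.items():
        y[i] = sum(a * (x_owned[j] if j in x_owned else x_ghost[j])
                   for j, a in row.items())
    return y


def distributed_matvec(tasks):
    """tasks: list of (rows, x_owned) pairs, one pair per task."""
    y = {}
    for rows, x_owned in tasks:
        # Stand-in for message passing: fetch each ghost entry from its owner.
        x_ghost = {}
        for j in needed_ghosts(rows, x_owned):
            for _, x_other in tasks:
                if j in x_other:
                    x_ghost[j] = x_other[j]
        y.update(local_matvec(rows, x_owned, x_ghost))
    return y
```

For a 4 x 4 tridiagonal matrix split between two tasks, each task needs exactly one vector entry from its neighbor, which is the quantity that must be placed in the reserved storage locations before the multiply.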

The  concurrent  iteration  hierarchy  given  in  Section  5.6.1  allows  for  proper  updating  to  take 
place  during  the  sparse-matrix  vector  multiplies  (linear  subiterations).  It  is  this  communication 
which  provides  the  subdomain  coupling  necessary  to  approximately  maintain  the  convergence 
rate  of  the  serial  implicit  algorithm.  If  the  contributions  from  an  adjacent  subdomain’s  control 
volumes  are  neglected  (termed  a  block  Jacobi  [BJ]  subdomain  coupling  method),  this  places 
an  artificial  boundary  within  the  domain  from  which  no  useful  information  propagates  and  no 
useful  information  can  penetrate.  Hence,  convergence  is  degraded  by  the  presence  of  subdomain 
boundaries  in  the  domain  unless  special  treatment  of  these  interfaces  is  undertaken.  Given  that 
a  relaxation  method  is  used  on  the  interior,  the  BJ  technique  degenerates  into  a  traditional  point 
Jacobi  method  as  the  number  of  subdomains  approaches  the  number  of  nodes  in  the  domain. 

For relaxation algorithms (such as Gauss-Seidel), this updating during subiterations now
allows recovery of a modified form of the original algorithm. The degree to which the relaxation
algorithm is recovered is determined by the frequency of updating of the interface ΔQ values. A
weak coupling can be accomplished by updating ΔQ only after each subiteration, and maximum
coupling can be accomplished by updating ΔQ at every available point; in this work, one has the
opportunity to update between color changes (BGS2, BGS3), between directional sweeps (BGS3), and
after each subiteration (BGS1, BGS2, BGS3).

To  establish  terminology  for  the  different  levels  of  subdomain  coupling,  the  acronym  BJ  is 
used  to  denote  a  block  Jacobi-type  iteration,  in  which  the  contributions  from  other  subdomains 
are  neglected.  BGS1,  BGS2,  and  BGS3  all  indicate  that  blocks  are  implicitly  coupled,  the 
strength of which is determined by the trailing number. BGS1 iterations only update interface
ΔQ after each subiteration; BGS2 iterations update after each subiteration and after each color
change; and BGS3 iterations update after each subiteration, color change, and directional sweep.
Note that for the Red-Black Jacobi algorithm (Section 5.4.1), BGS2 and BGS3 are identical,
since there is no directional sweep involved. Likewise, for the symmetric Gauss-Seidel algorithm
(Section 5.4.2), BGS1 and BGS2 are identical since there is only one node color involved. Figure
5.6 clarifies the four possible subdomain couplings.
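
The difference between neglecting and updating the interface values can be seen on a model problem. The following sketch is a simplified illustration, not the flow solver: it applies forward Gauss-Seidel sweeps to the scalar system tridiag(-1, 4, -1) x = 1 split into two blocks, and compares only BJ with BGS1. The function names and the model matrix are assumptions made for the example.

```python
def solve_two_blocks(n, nsub, coupling):
    """Block relaxation on tridiag(-1, 4, -1) x = 1 with two subdomains."""
    half = n // 2
    blocks = [range(0, half), range(half, n)]
    x = [0.0] * n        # plays the role of ΔQ, initialised to zero
    ghost = [0.0] * n    # values of x as seen from the other subdomain
    for _ in range(nsub):
        for blk in blocks:
            for i in blk:
                s = 1.0
                for j in (i - 1, i + 1):
                    if 0 <= j < n:
                        # In-block neighbours use the latest value;
                        # out-of-block neighbours use the ghost copy.
                        s += x[j] if j in blk else ghost[j]
                x[i] = s / 4.0
        if coupling == "BGS1":
            ghost = list(x)  # interface update after each subiteration
        # For "BJ" the ghost values stay zero: contributions are neglected.
    return x


def residual_norm(x):
    """Maximum-norm residual of the full (undivided) system."""
    n = len(x)
    worst = 0.0
    for i in range(n):
        ax = 4.0 * x[i]
        if i > 0:
            ax -= x[i - 1]
        if i < n - 1:
            ax -= x[i + 1]
        worst = max(worst, abs(1.0 - ax))
    return worst
```

With `n = 8` and 30 subiterations, the BGS1 residual of the full system falls below 1e-6, whereas the BJ residual stalls near the size of the neglected interface terms, consistent with the artificial boundary described above.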

Unfortunately,  if  the  sequential  memory  leveraging  hierarchy  (Section  5.6.1)  is  used,  the 
iteration  hierarchy  is  limited  to  one  update  point  at  the  beginning  of  each  time  step.  Since 
no  update  is  possible  during  linear  subiterations,  it  is  not  possible  to  account  for  neighboring 
subdomain  contributions  during  the  relaxation  algorithm.  Thus,  the  matrix  terms  corresponding 
to nodes owned by other subdomains must simply be neglected. Unfortunately, as mentioned
previously, it is this global communication of data during the subiterations that gives rise to
the accelerated convergence rates enjoyed by implicit methods. Thus, the sequential iteration
hierarchy  forces  one  to  use  the  BJ  iteration,  in  which  each  subdomain  is  isolated  from  the  rest 
during  the  matrix  solution. 

In  summary,  the  handling  of  contributions  from  neighboring  subdomains  during  the  solution 
of  the  linear  system  strongly  affects  the  convergence  rate  of  the  solver.  One  can  choose  to 
neglect the contributions (block Jacobi), or communicate these values at various points during
the subiterative process. Obviously, stronger coupling between subdomains implies a higher
overhead  from  message  exchanges.  The  cost  of  message  passing  on  the  host  architecture 
determines the optimal tradeoff such that total execution time is minimized.

[Figure 5.6 listing, only partially recovered: Compute [A]; ΔQ ← 0; the steps between,
which carried the markers for the BGS1, BGS2, and BGS3 subdomain coupling updates, were not
recovered; Q^(n+1) ← Q^n + ΔQ^n]

Figure 5.6: Definition of the BGS1, BGS2, and BGS3 subdomain coupling techniques; BJ
iteration performs none of the above updates during the linear subiteration.

5.6.6  Domain  Partitioning 

The  METIS  software  package  [38]  is  used  to  partition  the  unstructured  mesh.  This  package 
implements a set of multilevel graph partitioning algorithms [39], [40], [41] intended to perform
efficient partitioning of arbitrary graphs. The implementation is very fast (a 1 million point grid
can be partitioned in approximately one minute) and is able to minimize the
number of cut edges (which decreases communication costs). Note that the partitions provided
by  METIS  are  not  necessarily  contiguous;  however,  that  is  not  a  problem  for  the  current  parallel 
solution  algorithm. 
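
To make the cut-edge measure concrete, the sketch below counts cut edges for a given node-to-partition assignment. It does not call METIS; the function name and the small ladder graph are assumptions chosen for illustration.

```python
def count_cut_edges(edges, part):
    """Number of edges whose two end nodes lie in different partitions."""
    return sum(1 for a, b in edges if part[a] != part[b])


# Ladder graph: nodes 0-3 on the top rail, 4-7 on the bottom rail.
LADDER = [(0, 1), (1, 2), (2, 3),          # top rail
          (4, 5), (5, 6), (6, 7),          # bottom rail
          (0, 4), (1, 5), (2, 6), (3, 7)]  # rungs

TOP_BOTTOM = [0, 0, 0, 0, 1, 1, 1, 1]  # cuts every rung
LEFT_RIGHT = [0, 0, 1, 1, 0, 0, 1, 1]  # cuts one edge on each rail
```

Both assignments are balanced (four nodes each), but the left/right split cuts 2 edges while the top/bottom split cuts 4, so the former would require fewer interface exchanges.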


CHAPTER  VI 


THE  DEVELOPMENT  OF  AN  OBJECT  ORIENTED  VISUALIZATION  TOOLKIT 

Although  the  purpose  of  this  research  was  not  explicitly  focused  on  developing  a 
visualization  toolkit,  the  generation  of  a  significant  number  of  methods  and  algorithms  warranted 
a  closer  look  at  properly  packaging  them  for  use  beyond  the  scope  of  this  problem.  The  following 
sections  discuss  the  visualization  paradigm  used  in  this  research,  the  general  framework  that 
encompasses  the  toolkit,  and  existing  class  definitions  used  for  prototyping  and  developing  the 
toolkit  itself. 


6.1  Visualization  Paradigm  for  This  Research 

The  work  presented  in  this  paper  is  limited  to  the  analysis  and  development  of  the  framework 
and  the  algorithms  in  a  toolkit  that  operate  on  grids  and  solutions.  This  toolkit  is  implemented  as 
a  shared  library  that  may  be  used  in  a  broader  scope  for  a  variety  of  tasks.  The  manner  in  which 
the  toolkit  is  currently  used  in  the  ERC  is  as  a  computational  server  in  the  broader  context  of 
a  visualization  package  called  DIVA  (Data  Interactive  Visualization  and  Analysis).  The  system 
architecture of DIVA is illustrated in Figure 6.1.

The  box  labeled  Data  Vault  represents  the  disk  storage  needed  to  contain  the  input  data.  A 
Compute  Server  is  shown  as  a  separate  entity  that  does  CPU  intensive  calculations,  for  example: 
feature  identification,  isosurfaces,  cutting  planes,  particle  traces,  etc.  This  Compute  Server  is 
the  toolkit  that  contains  the  results  from  this  work.  The  Graphics  Server  is  the  piece  of  the 
architecture  that  handles  taking  graphical  primitives,  such  as  triangle  strips,  and  outputting  them 
to  either  a  file  that  contains  a  three-dimensional  representation  of  the  image,  an  image  that  can 
be  generated  or  rendered  off  screen,  or  a  computer  that  is  capable  of  displaying  interactive  high- 
resolution  displays.  The  connection  between  all  of  these  pieces  can  be  either  high-speed  network 
connections or can be resident memory. If the connection between the Data Vault and the
Compute Server is a network, then the method of data transmission is either data flow or
data streaming, depending on the type of the input data. Because we have imposed the restriction
that an input data set need not change format, it is not possible to allow for data streaming of
unstructured input data. Structured input data, however, may be streamed or decomposed into
manageable units if necessary. If the connection between the Compute Server and the Graphics
Server is a network, then the method of transmission is data streaming. The Compute Server acts
as an extraction tool that takes a requested piece of input data, extracts the necessary information,
and then passes it to the Graphics Server.

Figure 6.1: The system architecture for DIVA

DIVA  very  closely  matches  the  scientific  visualization  model  as  characterized  by 
Springmeyer  [42].  The  visualization  paradigm  that  DIVA  embodies  is  one  of  focusing  on  the 
underlying  process  that  is  practiced  by  the  user,  one  of  designer-as-apprentice,  discussed  in 
Chapter  3.  The  paradigm  present  in  DIVA  also  takes  into  account  the  data  communication  issues 
that  have  become  a  primary  focus  in  all  visualization  packages  currently  in  use.  The  issues 
that  DIVA  and  the  underlying  toolkits  face  in  terms  of  data  management  are  similar  to  those 
introduced  by  Schroeder  and  Cox  [43], [44],  [45],  in  that  this  research  has  had  to  acknowledge 
and deal with data sets that are much too large for resident memory capacity.

6.2 The General Framework Encompassing the Toolkit

The  Compute  Server  is  implemented  in  C++  as  a  set  of  base  classes.  There  also  exists  a 
set  of  C  wrappers  around  these  base  classes  that  provide  an  external  API  to  the  toolkit.  As 
was  previously  mentioned,  the  Compute  Server  is  compiled  into  a  shared  library  that  can  be 
called  from  an  arbitrary  front  end.  It  is  completely  separate  from  the  Graphics  Server  and  has 
no specific ties to any graphics language. This toolkit serves simply to house a set of grids and
solutions, allowing for both queries and extractions on this data. This chapter does not discuss
specific  algorithms  for  either  queries  or  extractions,  but  does  serve  to  set  the  stage  for  how 
these  algorithms  come  together  to  act  as  methods  in  this  Compute  Server.  The  Compute  Server 
is  capable  of  operating  in  a  distributed  fashion,  meaning  distributed  from  both  the  front  end 
and from the graphics. It is designed to be able to communicate through pointers if all pieces
reside on the same machine, either through a single processor or through a shared memory
arena. It is also designed to allow for distributed communication across a network through
the Remote Procedure Call (RPC) protocol using the External Data Representation (XDR)
library, which represents data structures in a machine-independent form [46]. RPC provides the ability
to communicate with procedures or processes outside of an application's current address space.
This  allows  a  local  program  to  execute  a  procedure  on  a  remote  machine,  passing  data  to  it 
and  receiving  data  from  it.  Using  the  RPC  protocol,  the  Compute  Server  can  reside  on  a 
geographically  distant  machine  from  a  front-end  graphical  display.  This  makes  the  capability 
of  the  Compute  Server  much  more  attractive  because  this  design  directly  addresses  the  concern 
of  having  to  invest  in  expensive  front-end  graphical  displays  with  extraordinary  amounts  of 
memory. Eventually, the size of the problem would prohibit the Compute
Server from operating in the same address space as the graphical front-end. The Compute Server
contains methods that allow for queries (short-burst questions that return reasonably
small answers) and for extractions. Extractions are used to reduce the focus of interest from the
entire data set to regions defined by a set of inputs. These extractions can be cutting planes,
isosurfaces, particle traces, boundary surfaces, and in the most extreme sense, the volume itself.
Algorithms to perform both queries and extractions will be discussed in more detail in the next
chapter.  The  following  section  provides  an  explanation  of  each  of  the  classes  that  together  make 
up  the  Compute  Server. 


6.3  Class  Definitions 

The  classes  are  presented  by  discussing  the  rudimentary  framework  classes  and 
administrative  classes  first.  This  gives  an  idea  of  how  the  entire  Compute  Server  is  put  together 
to  form  a  library  of  routines  that  work  together  to  achieve  a  common  set  of  goals.  Presented 
after  that  are  the  grid  classes  that  house  every  type  of  grid  that  may  be  supported  in  this  current 
implementation.  Grid  components  classes  are  discussed  to  define  how  base  level  extractions  are 
made  from  the  grids  themselves  that  are  native  to  specified  grid  types.  Grouping  classes  are 
shown  for  facilitating  multiple  grid  capabilities.  Function  value  classes  explain  how  both  scalar 
and  vector  data  are  handled.  Extraction  classes  define  what  types  of  extractions  are  possible 
through  the  current  Compute  Server.  Graphical  properties  classes  provide  an  introduction  to 
graphics  entities  that  are  more  naturally  housed  within  the  confines  of  the  Compute  Server. 

6.3.1 Framework and Administrative Classes

The top level management object in the Compute Server is called the csObject. It contains
a pointer to a class called csLinkedList and a void pointer to a rendering object. Because the
Compute  Server  is  designed  to  be  independent  of  a  graphical  language,  and  because  it  may 
operate  as  a  query  and  extraction  device  that  outputs  data  in  a  file  format  as  well  as  graphical, 
the  rendering  object  pointer  may  be  empty.  This  allows  the  Compute  Server  to  behave  in  a  batch 
type  format  that  requires  no  front  end  display  and  may  be  driven  by  a  command  line  interface. 
This  capability,  however  simple  it  may  be,  has  been  invaluable  in  generating  visualizations  for 
the  large  scale  data  sets  that  have  come  through  the  ERC.  It  is  not  feasible  to  expect  the  user  to 
sit  at  a  workstation  and  wait  for  extractions  to  take  place  as  they  are  being  displayed.  Rather, 
a  situation  in  which  a  batch  extraction  can  be  performed  and  written  to  disk  for  later  display 
has been very useful in a production visualization setting. The csLinkedList class embodies
the concept of a linked list of character strings with corresponding void pointers to entities and
their corresponding enumerated types. Each entity that is a member of the csObject manager
must have a name associated with it. The names are assigned at creation, and serve as a unique
identifier for each entity. The csLinkedList class handles the creation of the linked list, additions,
deletions, and queries. The next version of the Compute Server will allow for a csHashTable
class to handle the management of these named entities. It is currently incumbent upon the
user to avoid name clashes. Future versions will provide a capability to detect and correct name
clashes.  The  named  entities  that  are  currently  supported  in  the  Compute  Server  are  groups,  grids, 
scalar  and  vector  functions,  blocks,  computational  surfaces,  boundary  surfaces,  cutting  planes, 
particle  traces,  isosurfaces,  color  bars  and  axes. 
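
The named-entity bookkeeping described above can be sketched as follows. This is a hypothetical Python illustration, not the csLinkedList interface (the toolkit itself is written in C++): the class and method names are invented, the entity type is a plain string rather than an enumerated type, and the name-clash check illustrates the detection planned for future versions rather than current behavior.

```python
class _Record:
    """One list node: a name, a type tag, and a reference to the entity."""

    def __init__(self, name, kind, entity, nxt=None):
        self.name, self.kind, self.entity, self.next = name, kind, entity, nxt


class NamedEntityList:
    """Singly linked list of named entities (hypothetical sketch)."""

    def __init__(self):
        self.head = None

    def find(self, name):
        node = self.head
        while node is not None:
            if node.name == name:
                return node
            node = node.next
        return None

    def add(self, name, kind, entity):
        # Assumed behavior: reject a duplicate name instead of relying
        # on the user to avoid clashes.
        if self.find(name) is not None:
            raise KeyError(f"name clash: {name!r}")
        self.head = _Record(name, kind, entity, self.head)

    def remove(self, name):
        prev, node = None, self.head
        while node is not None:
            if node.name == name:
                if prev is None:
                    self.head = node.next
                else:
                    prev.next = node.next
                return True
            prev, node = node, node.next
        return False
```

Every lookup walks the list, so the cost grows with the number of entities; a hash table keyed by name would make `find` a constant-time operation on average.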

6.3.2  Grid  Classes 

The  basic  entity  in  the  Compute  Server  is  the  grid.  The  function  of  this  set  of  classes  is 
to  operate  on  either  the  grids  themselves  or  to  operate  on  extractions  that  were  created  from 
these  grids.  For  this  reason,  much  care  and  thought  was  put  into  the  design  of  the  grid  classes 
themselves.  There  is  a  basic  csGrid  class  that  houses  a  set  of  base  functions  that  are  common  to 
all  types  of  grids  and  a  set  of  virtual  functions  that  must  be  defined  uniquely  inside  each  specific 
type  of  grid.  The  csGrid  contains  a  pointer  to  a  csObject,  the  manager  object.  It  also  contains 
the name of the csGrid, the type, the dimension, bounding volume information, the number of
points  inside  the  grid,  the  number  of  unique  elements  that  make  up  the  grid,  and  various  other 
internal  management  information.  A  csGrid  can  be  typed  as  unstructured,  structured,  mixed 
element,  scattered,  or  unknown.  The  dimension  is  2D,  3D,  or  4D.  The  implementations  of  2D 
and  3D  grids  are  done  separately  to  avoid  any  unnecessary  memory  allocations.  A  pointer  to  a 
csGroup  is  also  retained  to  allow  for  a  multiple  grid  capability.  This  is  useful  when  a  larger  grid 
is  decomposed  into  smaller  units  for  computational  purposes,  but  must  be  brought  back  together 
for extractions or display.

6.3.2.1 Virtual Definitions

A set of virtual functions that each grid must implement is defined explicitly in the csGrid
class. These virtual functions include the capability to get element information, get element
neighbor information, obtain the element that contains a point, create the necessary extractions for
an initial display, calculate cutting planes, and create isosurfaces. These virtual functions allow
higher level methods to insulate the algorithm itself from the details of a specific grid type.
For example, the fundamental algorithm for calculating particle traces is the same in the abstract
sense. It is only specific to a grid type in traveling from one element to the next (getting the
element neighbor), interpolating function values inside a particular element, etc. The presence
of virtual functions inside the csGrid class allows the formation of a generic csParticleTrace class
that is capable of calculating particle traces on a variety of valid grid types. The virtual functions
inside  the  csGrid  class  allow  the  details  for  the  differences  in  behavior  of  each  grid  type  to 
be  encapsulated  inside  the  specific  grid  itself.  The  individual  grid  classes  that  are  supported 
are  structured  curvilinear,  unstructured  tetrahedral,  unstructured  mixed  element,  and  scattered 
data.  For  the  sake  of  accomplishing  goal  number  three,  minimizing  resident  memory  usage, 
each  of  the  supported  grid  classes  was  implemented  separately.  The  Field  Encapsulation  Library 
(FEL)  presented  in  Chapter  2  is  written  as  a  set  of  object  oriented  classes  as  well.  However,  it 
distinguishes  algorithmic  behavior  at  the  element  level  [47].  The  Compute  Server  distinguishes 
algorithmic  behavior  at  the  grid  level.  Computationally,  the  smallest  named  entity  is  a  grid, 
or  some  extraction  that  has  been  taken  from  a  grid.  This  treatment  of  class  management  is 
preferable when implementing algorithms such as querying an element neighbor. At the grid
level, it is possible to insulate most of the detail inside the specific grid implementation class
itself, which allows for refinements that improve both speed and memory usage. A structured
curvilinear  grid  does  not  require  an  explicit  neighbor  map  inside  blocks.  However,  between 
block  boundaries,  an  explicit  neighbor  map  is  required  to  travel  from  block  to  block  seamlessly. 
FAST  uses  implicit  neighbor  mapping  for  traversal  inside  a  block,  but  stops  at  block  boundaries 
unless an EBLANKED grid has been input that explicitly gives block to block mapping [48].
The structured curvilinear grid implemented inside the Compute Server calculates this explicit
neighbor map for block to block traversal transparently to the user. In an unstructured grid, an
explicit neighbor map is required to be stored. This low level detail is made transparent to the user by
providing these virtual functions in the abstract csGrid class.
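
The role of the virtual functions can be illustrated with a small sketch. It is written in Python for brevity (the Compute Server is C++), uses one-dimensional grids, and every class and function name in it is hypothetical; it shows only how a generic traversal can rely on a grid-specific neighbor query, with an implicit rule for a structured grid and a stored map for an unstructured one.

```python
from abc import ABC, abstractmethod


class Grid(ABC):
    """Abstract grid: each grid type supplies its own neighbor query."""

    @abstractmethod
    def neighbor(self, elem, face):
        """Return the element across the given face, or None at a boundary."""


class StructuredLine(Grid):
    """Structured grid: the neighbor is implied by index arithmetic."""

    def __init__(self, ncells):
        self.ncells = ncells

    def neighbor(self, elem, face):
        nxt = elem + (1 if face == 1 else -1)
        return nxt if 0 <= nxt < self.ncells else None


class UnstructuredLine(Grid):
    """Unstructured grid: an explicit neighbor map must be stored."""

    def __init__(self, neighbor_map):
        self.neighbor_map = neighbor_map  # elem -> (left, right)

    def neighbor(self, elem, face):
        return self.neighbor_map[elem][face]


def walk(grid, start, face, max_steps):
    """Generic traversal that never inspects the concrete grid type."""
    path = [start]
    elem = start
    for _ in range(max_steps):
        elem = grid.neighbor(elem, face)
        if elem is None:
            break
        path.append(elem)
    return path
```

The same `walk` routine traverses both grid types; only the `neighbor` method differs, which mirrors how a single particle trace class can serve several grid classes.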

6.3.3  Grid  Components  Classes 

There  are  lower  level  entities  that  form  regions  of  interest  in  a  grid  that  are  not  necessarily 
extractions  in  the  purest  sense.  For  structured  grids,  these  are  blocks  and  computational  surfaces. 
A  structured  grid  is  made  up  of  blocks  that  speak  in  terms  of  i,  j,  and  k  indices.  At  a  specified 
block,  i,  j,  or  k  index,  a  computational  surface  can  be  seen.  The  Compute  Server  allows  for 
these entities as separate classes, and unless the user specifically queries information from them,
the details of these classes are transparent to the user. They serve to insulate the user from
detail that is not critical information. The language in an unstructured grid is very different. An
unstructured volume grid is made up of surface grids that represent boundary conditions and
the field itself. Again, this information is hidden from the user unless it is specifically queried.
These entities are automatically generated when the csGrid is created. Blocks and computational
surfaces are generated only if the volume grid is structured curvilinear, and boundary conditions
and boundary surfaces are generated only if the volume grid is unstructured.
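
As an illustration of a computational surface, the sketch below lists the points of a structured block that lie on a surface of constant i, j, or k. The function name and the i-fastest point ordering are assumptions made for the example; the storage order used by the Compute Server is not stated here.

```python
def computational_surface(dims, axis, index):
    """Flat point indices on the surface where the chosen index is constant.

    dims  : (ni, nj, nk) point counts of the block
    axis  : 0 for i, 1 for j, 2 for k
    index : the constant value of that index
    Points are assumed to be stored with i varying fastest.
    """
    ni, nj, nk = dims
    if not 0 <= index < dims[axis]:
        raise IndexError("index lies outside the block")
    points = []
    for k in range(nk):
        for j in range(nj):
            for i in range(ni):
                if (i, j, k)[axis] == index:
                    points.append(i + ni * (j + nj * k))
    return points
```

For a 2 x 2 x 2 block, the surface i = 1 consists of points 1, 3, 5, and 7, while the surface k = 0 consists of points 0 through 3.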

6.3.4  Grouping  Classes 

A  csDataGroup  class  is  available  for  grouping  instances  of  data  objects  that  represent 
grids,  solutions,  and  extractions.  These  entities  are  in  turn  pieces  that,  when  combined,  form 
a  large  grid.  The  large  superset  is  decomposed  into  smaller  units  to  enable  parallelizing  the 
computations  of  the  solution  values  at  the  grid  points.  When  trying  to  visualize  this  entity, 
it  can  be  input  as  separate  grid  and  solution  objects  that  must  be  grouped  to  provide  a  single 
representation  to  the  user.  The  csDataGroup  allows  the  melding  of  these  separate  data  objects 
into a single group. When extractions are computed, they are computed on the grouped entity. The
user does not carry the burden of having to manually regroup these entities.

6.3.5 Function Value Classes

Function values are stored in the Compute Server in one of two classes, the csScalarFunction
class or the csVectorFunction class. The two classes are very similar in nature. They have
the same set of methods. However, the csScalarFunction manages scalar variables, and the
csVectorFunction manages vector variables. Both classes store the function name, a pointer
to the associated grid class, a pointer to or a copy of the actual data values themselves (depending
on the type of communication), and extrema information. Each function is required to have the
same number of point values as the grid with which it is associated. The Compute Server handles
only functions that come in with a grid. The grid is assigned at creation and must be specified by
the user. The Compute Server checks that the number of points in the function
matches the number of points in the specified grid. This grid is also required to be created before
the creation of function values.
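
The consistency check between a function and its grid can be sketched as below. The class is a hypothetical Python stand-in for a scalar function object, not the csScalarFunction interface; in particular, the grid is represented only by its point count, and the choice of exception is an assumption.

```python
class ScalarFunctionSketch:
    """Scalar function tied to a grid that must already exist."""

    def __init__(self, name, grid_npoints, values):
        # The function must carry exactly one value per grid point.
        if len(values) != grid_npoints:
            raise ValueError(
                f"{name}: {len(values)} values for a grid of {grid_npoints} points"
            )
        self.name = name
        self.values = list(values)
        # Extrema are stored alongside the data.
        self.minimum = min(self.values)
        self.maximum = max(self.values)
```

Constructing the object with a value count that differs from the grid's point count fails immediately, so a mismatch is caught at creation rather than during a later extraction.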


6.3.6  Extraction  Classes 

There  are  currently  three  types  of  extractions  that  are  fully  implemented  inside  the  Compute 
Server:  cutting  planes,  isosurfaces,  and  particle  traces.  The  details  of  the  algorithms  are 
discussed  in  the  next  chapter.  This  section  serves  to  point  out  that  the  management  of  the  specific 
properties of each of the extractions is done in the classes themselves. The actual traversal or
interpolation  that  is  performed  within  the  context  of  these  specific  algorithms  is  handled  by  the 
grid  on  which  the  extraction  is  being  calculated. 

6.3.7  Graphical  Properties  Classes 

Although the purpose of the Compute Server is not necessarily tied to a graphics system,
and it does not have to be used to do any post-processing, there is a need to keep some graphical
information present in the Compute Server. This graphical information overlaps with the
function data. For example, a csColorBar class is provided to manage color bar information that
can be used to create and obtain the display of entities using a user-specified color map. The
color map is managed like all other entities in the Compute Server, by name. Any number of
colors can be specified to make up the color map. Additionally, a csAxis class is provided
for the management and display of legends and axes on the computational data.

6.4  Summary 

An  overview  of  both  the  design  and  the  implementation  of  the  Compute  Server  has  been 
given.  It  is  now  relevant  to  begin  discussing  how  the  classes  and  the  structure  of  these  classes 
inside  this  entity  facilitate  the  rapid  prototyping  of  efficient  methods  for  both  the  query  and 
extraction  of  information  from  the  given  input  data.  This  efficiency,  as  will  be  discussed 
in  the  next  chapter,  refers  to  both  efficient  memory  usage  and  efficient  computational  time 
performance. 


CHAPTER VII


DESCRIPTION  AND  ANALYSIS  OF  THE  KEY  VISUALIZATION  TOOLKIT 

ALGORITHMS 

This  chapter  presents  and  provides  analysis  for  the  key  algorithms  that  make  up  the  Compute 
Server.  These  algorithms  have  been  refined  and  tailored  to  achieve  a  balance  between  both 
minimizing  resident  memory  and  optimizing  speed  performance.  The  concepts  behind  these 
algorithms  are  not  new.  Surface  extraction  and  visibility  ordering  of  unstructured  polyhedra  are 
not  new  to  the  visualization  community.  However,  these  algorithms  have  been  optimized  for 
performance  in  a  computational  setting  for  the  purposes  of  feature  extraction  and  visualization 
of  large  scale  unstructured  scientific  data  sets.  These  uniquely  optimized  algorithms  placed  in 
the  framework  discussed  in  the  previous  chapter  make  this  work  an  original  effort  to  fine  tune 
the  basic  set  of  algorithms  needed  to  compute  and  display  features  and  extractions  from  large 
scale  CFD  data  sets.  This  effort  has  certainly  been  a  creative  endeavor  to  investigate  the  inner 
workings  of  all  components  of  the  Compute  Server.  These  components  include  the  core  data 
structures, search algorithms that use these data structures, and extractions that use both. This
work was conducted as a result of investigating the current state-of-the-art and realizing that
many of the existing techniques do not clearly delineate where the cost/performance line is. This
is because many existing systems attempt to create a general framework and a general set of
algorithms that is capable of handling any type of input data. The algorithms presented in this
chapter have not been thoroughly investigated on all possible types of input data; rather, the
scope has been limited to include only data from large scale unstructured CFD simulations.

7.1  Data  Structures 

Although data structures are a vital component in any algorithm, they are often the most
overlooked component. An inappropriate choice for basic data structures will often lead to either
continuous algorithmic reworking to overcome this or, as is often the case, a total rewrite of the
software with different data structures that have been found to be more applicable to the problem
at hand. Given the premise that a balance must be achieved between both minimizing resident
memory and optimizing speed performance, it was clear that the data structures had to be simple
enough to minimize any overhead needed for constructs such as pointers, structures, classes, etc.
The following sections outline the data structures that serve as the basis for all searching and
unstructured grid traversal.


7.1.1  Defined  Data  Types 

The Compute Server operates on a set of predefined data types [49]. These data types are
used throughout all of the algorithms. Although the algorithms for traversing and visualizing
unstructured volumes can be extremely complex, the data structures need not be. In fact, to
accomplish the goal of minimizing memory usage, all significantly sized data is placed in
one-dimensional arrays. Table 7.1 shows how these one-dimensional arrays can be type
defined. Placing large amounts of data in one-dimensional arrays eliminates
any overhead in pointer allocations. Additionally, the manner in which these items are typed
facilitates ease-of-use and improved readability of the code. The data placed in a typed
array may be accessed as if the array had been allocated with its respective dimension. For example,
an item vertex at index i in an array of FLOAT_3D values can be accessed as vertex[i][0],
vertex[i][1], and vertex[i][2]. If an array of FLOAT_3D values is allocated for N values, then
the amount of memory used is exactly 3*N floats. There is no overhead for pointers and no need for
double dereferencing.
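
As a minimal sketch of the storage claim above, the following C++ fragment allocates an array of FLOAT_3D items and checks that consecutive items sit exactly three floats apart. The helper names (bytesForVertices, makeVertices, isContiguous) are hypothetical and are not part of the Compute Server; only the FLOAT_3D typedef follows Table 7.1.

```cpp
#include <cassert>
#include <cstddef>

typedef float FLOAT_3D[3];  // mirrors the FLOAT_3D entry of Table 7.1

// Bytes occupied by an array of n FLOAT_3D items: exactly 3*n floats.
std::size_t bytesForVertices(std::size_t n)
{
    return n * sizeof(FLOAT_3D);
}

// Allocate n vertices in one contiguous block and fill item i with (i, 2i, 3i).
// The caller releases the block with delete[]. (Hypothetical helper.)
FLOAT_3D *makeVertices(std::size_t n)
{
    assert(n > 0);
    FLOAT_3D *vertex = new FLOAT_3D[n];
    for (std::size_t i = 0; i < n; i++) {
        vertex[i][0] = 1.0f * static_cast<float>(i);
        vertex[i][1] = 2.0f * static_cast<float>(i);
        vertex[i][2] = 3.0f * static_cast<float>(i);
    }
    return vertex;
}

// True when item i+1 starts exactly three floats after item i, i.e. there is
// no pointer table and no padding between items.
bool isContiguous(const FLOAT_3D *vertex, std::size_t n)
{
    for (std::size_t i = 0; i + 1 < n; i++) {
        const char *here = reinterpret_cast<const char *>(vertex[i]);
        const char *next = reinterpret_cast<const char *>(vertex[i + 1]);
        if (next - here != static_cast<std::ptrdiff_t>(3 * sizeof(float)))
            return false;
    }
    return true;
}
```

Because vertex[i] is itself an array of three floats, vertex[i][k] is resolved with plain address arithmetic on one block rather than through a second pointer.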

7.1.2  Array  Indexing 

All of the algorithms that are presented in this work operate on a variety of type defined
one-dimensional arrays. The manner in which they operate on these arrays has a common theme [50].
This theme will be discussed using the structures needed to compute the elements surrounding a
point. This algorithm is presented in detail in the next section.

Table 7.1: Defined Data Types in the Compute Server

typedef char    CHAR_21[21];
typedef int     INT_1D;
typedef int     INT_2D[2];
typedef int     INT_3D[3];
typedef int     INT_4D[4];
typedef int     INT_5D[5];
typedef int     INT_6D[6];
typedef int     INT_7D[7];
typedef int     INT_8D[8];
typedef float   FLOAT_1D;
typedef float   FLOAT_2D[2];
typedef float   FLOAT_3D[3];
typedef double  DOUBLE_1D;
typedef double  DOUBLE_2D[2];
typedef double  DOUBLE_3D[3];



The  following  lines  of  code  show  the  arrays  needed  to  compute  and  store  the  elements 
surrounding  a  point.  tNESP  is  used  as  a  temporary  array  needed  to  construct  nESP.  nESP  is  the 
actual  storage  needed  to  construct  the  eSP  array.  eSP  contains  the  actual  elements  surrounding 
each  given  point. 

INT_1D *tNESP = NULL;  // A temporary array for constructing nESP.
INT_1D *nESP = NULL;   // Contains the number of elements surrounding each point.
INT_1D *eSP = NULL;    // Contains the specific elements surrounding each point.

The  first  operation  in  the  creation  of  the  elements  surrounding  a  point  is  to  allocate  both  nESP 
and  tNESP.  tNESP  is  allocated  after  nESP  because  it  will  be  immediately  deallocated  once  nESP 
has been fully constructed. Each of these arrays is allocated to be of length numberOfNodes+1.
Each  index  in  both  arrays  is  initialized  to  zero. 


nESP = new INT_1D[numberOfNodes+1];   // Allocate memory for actual array.
tNESP = new INT_1D[numberOfNodes+1];  // Allocate memory for temporary array.
for (i = 0; i <= numberOfNodes; i++)  // Cycle i over all nodes in grid plus one.
{
    tNESP[i] = 0;  // Initialize temporary array to contain zeros.
    nESP[i] = 0;   // Initialize actual array to contain zeros.
}  // End cycle over all nodes in grid plus one.


A  pass  is  then  made  over  all  of  the  elements  to  increment  the  values  in  tNESP.  After  this 
loop,  the  values  in  tNESP  reflect  the  actual  number  of  elements  surrounding  each  point. 


for (i = 0; i < numberOfElements; i++)  // Cycle over all elements.
{
    for (j = 0; j < numNodesInElement; j++)  // Cycle over all nodes in element i.
        tNESP[element[i][j]]++;  // Increment the count for point element[i][j].
}


Once  the  arrays  have  been  allocated  and  initialized,  and  the  number  of  elements  surrounding 
each  point  in  the  input  grid  has  been  set,  nESP  is  updated  to  act  as  an  index  into  the  eSP  array. 


for (i = 1; i <= numberOfNodes; i++)  // Cycle i over all nodes plus one, starting at one.
    nESP[i] = nESP[i-1] + tNESP[i-1];  // Set nESP[i] to the number of elements
                                       // surrounding all points before point i.
delete[] tNESP;  // Deallocate the memory needed for this temporary variable.


The last pass loops back over all nodes of every element in the input grid. During
this pass, an entry is made in the eSP array for each element index: element i is added to the
list of elements surrounding each of the points that make up element i.


eSP = new INT_1D[nESP[numberOfNodes]];  // Allocate memory for content.
for (i = 0; i < numberOfElements; i++)  // Cycle over all elements.
{
    for (j = 0; j < numNodesInElement; j++)  // Cycle over all nodes in element i.
    {
        pIndex = element[i][j];  // Dereference the point index.
        eIndex = nESP[pIndex];   // Dereference the element index.
        eSP[eIndex] = i;         // Add the element i to eSP.
        nESP[pIndex]++;          // Increment index nESP[pIndex].
    }
}


An example of this set of operations is shown in Figure 7.1. Part A shows tNESP after the
first pass. Each of the values in tNESP reflects the number of elements surrounding that point.
tNESP[0] = 4 indicates that point 0 has 4 elements surrounding it. Part B is an example of what
nESP contains after it is updated. At this point nESP[0] reflects that point 0 has a beginning index
of 0 into the eSP array. nESP[1] = 4 indicates that point 1 has a beginning index of 4 into the eSP
array. Part C illustrates the contents of both nESP and eSP after the construction phase. nESP[0]
= 4 indicates the ending index (one past the last entry) into eSP for all elements surrounding
point 0. Because we are at the first point in nESP, the beginning index into eSP is 0. The elements
surrounding point 0 are contained in indices 0, 1, 2, and 3. This is consistent with the contents of
eSP containing elements E1, E2, E3, and E4. nESP[1] = 7 indicates the ending index into eSP for
all elements surrounding point 1. The beginning index into eSP for point 1 is nESP[0] = 4. Again,
this is consistent with the elements E5, E6, and E7, starting at index 4 and ending at index 6.


(A) tNESP [4 3 2 1 6 4 1 0]

(B) nESP [0 4 7 9 10 16 20 21]

(C) nESP [4 7 9 10 16 20 21 21]

    eSP [E1 E2 E3 E4  E5 E6 E7  E8 E9 ...]
             P1          P2       P3

Figure 7.1: Index Arrays Used to Construct Elements Surrounding a Point

As was stated earlier, this method of using one-dimensional arrays is used
frequently throughout all algorithms in the Compute Server. The next section presents a detailed
algorithm and analysis for finding the elements surrounding each point in the input grid.
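
The indexing convention of Part C in Figure 7.1 can be sketched as a small lookup. This is an illustration only: the Range struct and the functions elementsAroundPoint and figureNESP are hypothetical names, and std::vector is used in place of the raw arrays of the listings above.

```cpp
#include <cassert>
#include <vector>

// Half-open range [begin, end) of entries in eSP that belong to one point.
struct Range { int begin; int end; };

// nESP is taken as it stands after the final construction pass, when
// nESP[p] holds the ending index for point p and nESP[p-1] the beginning.
Range elementsAroundPoint(const std::vector<int> &nESP, int p)
{
    assert(p >= 0 && p < static_cast<int>(nESP.size()));
    Range r;
    r.begin = (p == 0) ? 0 : nESP[p - 1];
    r.end = nESP[p];
    return r;
}

// The nESP values of Part C in Figure 7.1.
std::vector<int> figureNESP()
{
    return {4, 7, 9, 10, 16, 20, 21, 21};
}
```

With the figure's values, point 0 maps to the range [0, 4) and point 1 to [4, 7), which matches the entries E1 through E4 and E5 through E7 described in the text.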

7.1.3  Elements  Surrounding  A  Point 

Before any of the actual algorithms can be performed, a set of base data structures must be
created. The first is that of finding and recording the elements surrounding a point [50]. The
concept of elements surrounding a point is illustrated in Figure 7.2. For a given point P, the set
of elements surrounding this point are those whose faces contain the point P. In Figure 7.2, the
elements surrounding point P are shown to be elements E1, E2, E3, E4, E5, and E6.


ESP[P] = [E1, E2, E3, E4, E5, E6]

Figure  7.2:  Elements  Surrounding  a  Point 

The  following  algorithm  is  based  on  that  given  in  [50]: 

1. Allocate the array nESP at a size of numberOfNodes+1. nESP is the array that will be
used to both construct and index into the eSP array.

2. Allocate tNESP at a size of numberOfNodes+1. tNESP is a temporary array used to
construct nESP.

3. Cycle over all nodes and initialize tNESP and nESP to zero.

4. Cycle over all elements i in the input grid.

• Get the point index pIndex for each node of element i.

• Increment tNESP[pIndex] by one.

5. Cycle over all nodes i in the input grid starting at one, and set
nESP[i] = nESP[i-1] + tNESP[i-1].

6. Delete the memory allocated for tNESP.

7. Allocate the array eSP at a size of nESP[numberOfNodes].

8. Cycle over all elements i in the input grid.

• Get the point index pIndex for each node j of element i.

• Let eIndex = nESP[pIndex].

• Set eSP[eIndex] = i.

• Increment nESP[pIndex] by one.

Both nESP and tNESP in steps 1 and 2 are allocated to be of length numberOfNodes+1,
and are thus O(M + 1) in memory, where M is the number of nodes in the input grid. Step
3 of the algorithm cycles over all nodes and thus operates in O(M) time. The loop in step 4
cycles over all nodes in all elements in the input grid, making it O(aN) in time, where N is the
number of elements in the input grid and a is a constant that represents the average number of
nodes per element. Because we only deal with elements whose number of nodes ranges from
four for tetrahedra to eight for hexahedra, the constant a is very small compared to N, and is
in the worst case 8. Step 5 of the algorithm operates in O(M + 1) time. Step 6 is the point
in the algorithm where the temporary memory allocated for tNESP can be released. Because
tNESP was allocated after nESP and no additional memory has been allocated since, this release
helps to avoid memory fragmentation issues. Step 7 allocates an array of size nESP[numberOfNodes].
Although we do not know the exact value of nESP[numberOfNodes] a priori, we can easily make
a reasonable estimate for the types of input grids in this research. For a purely tetrahedral input
grid generated for computational fluid dynamics simulations, the estimate
would be 24 * M, and the actual value is typically smaller than this. This estimate is a result of
the manner in which the grid must be constructed to ensure reasonable quality. For a grid
constructed solely of six noded pentahedra (prisms), the estimate would be 12 * M. A purely
hexahedral grid would give an estimate of 8 * M. Again, M is the number of nodes in the input
grid. Given the above stated numbers, a worst case estimate of the amount of memory allocated
for eSP is O(24 * M). An average estimate is O(16 * M). We can state this as O(cM), where c
ranges from 8 to 24 for a mixed element input grid and M is the number of nodes in the input
grid. Step 8 repeats the traversal of step 4 over all nodes of all elements, so it also operates in
O(aN) time, where N is the number of elements in the input grid. An overall analysis
of this algorithm tells us that it operates linearly in both space and time. It is O(cM) in space
and O(aN) in time, with c ≪ M and a ≪ N. The detailed implementation of this algorithm is
given in Appendix C.
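
The eight steps above can be collected into one routine. The following is a sketch under stated assumptions, not the Appendix C implementation: it assumes every element has the same number of nodes (nodesPerElement), stores connectivity in a flat std::vector rather than the typed arrays of Table 7.1, and uses the hypothetical names PointElementMap and buildPointElementMap.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compact "elements surrounding a point" map. After construction, entries
// eSP[begin(p)] .. eSP[end(p)-1] are the elements that touch point p.
struct PointElementMap {
    std::vector<int> nESP;  // nESP[p] = ending index into eSP for point p
    std::vector<int> eSP;   // concatenated per-point element lists
    int begin(int p) const { return p == 0 ? 0 : nESP[p - 1]; }
    int end(int p) const { return nESP[p]; }
};

// element holds nodesPerElement point indices per element, stored flat.
PointElementMap buildPointElementMap(const std::vector<int> &element,
                                     int nodesPerElement, int numberOfNodes)
{
    assert(nodesPerElement > 0);
    const int numberOfElements =
        static_cast<int>(element.size()) / nodesPerElement;

    PointElementMap m;
    m.nESP.assign(numberOfNodes + 1, 0);           // steps 1 and 3
    std::vector<int> tNESP(numberOfNodes + 1, 0);  // steps 2 and 3

    for (std::size_t k = 0; k < element.size(); k++)  // step 4: count
        tNESP[element[k]]++;

    for (int i = 1; i <= numberOfNodes; i++)       // step 5: running sum
        m.nESP[i] = m.nESP[i - 1] + tNESP[i - 1];

    std::vector<int>().swap(tNESP);                // step 6: release storage
    m.eSP.resize(m.nESP[numberOfNodes]);           // step 7

    for (int i = 0; i < numberOfElements; i++)     // step 8: fill eSP
        for (int j = 0; j < nodesPerElement; j++) {
            int pIndex = element[i * nodesPerElement + j];
            m.eSP[m.nESP[pIndex]++] = i;
        }
    return m;
}
```

For two tetrahedra {0, 1, 2, 3} and {1, 2, 3, 4} that share a face, eSP holds eight entries, and the three shared points each list both elements. A mixed element grid would need a per-element node count in place of the fixed nodesPerElement.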


7.1.4  Localized  Decomposition  Into  Tetrahedra 

The algorithms that are to be presented in future sections will operate on a variety of element
types. As has been previously stated, these elements are tetrahedra, pyramids, prisms, and
hexahedra. The basic type of element that all of the algorithms in the Compute Server operate
on is the tetrahedron. All other types of elements that have been introduced can be locally
decomposed into tetrahedra, and then each tetrahedron can be handled similarly. The decomposition
is termed local since it is performed within the confines of the algorithm. No additional
memory is required, and the operation is just a matter of indexing appropriately into the points
making up the global element being decomposed. An example of the decomposition of a pyramid
is illustrated in Figure 7.3. Part (A) shows the original pyramid. Part (B) shows how the pyramid
can be decomposed to form two full face matching tetrahedra. The first tetrahedron is shown as
being constructed by the points P1, P4, P3, and P5. The second tetrahedron is shown as being
constructed by the points P1, P2, P3, and P4. Additionally, Part (C) gives an exploded view of
the two tetrahedra that are combined to form a pyramid.

Figure 7.3: Local Decomposition of a Pyramid into Tetrahedra

The decomposition of a prism is shown in Figure 7.4. The original prism is shown in part
(A). Part (B) illustrates the combined view of the three tetrahedra that are generated as a result
of the decomposition. They are full face matching. The first tetrahedron is constructed from the
points P1, P2, P3, and P4. The second tetrahedron is constructed from the points P2, P3, P4, and
P5. The third tetrahedron is constructed from the points P3, P4, P5, and P6. Part (C) gives an
exploded view of the three tetrahedra.


Figure 7.4: Local Decomposition of a Prism into Tetrahedra


The decomposition of a hexahedron is shown in Figure 7.5. The original hexahedron is shown
in part (B). It is made up of 8 point indices P1, P2, P3, P4, P5, P6, P7, and P8. A hexahedron
can be thought of as being divided into two prisms which can be further decomposed into three
tetrahedra each. Part (A) illustrates the three tetrahedra that are a result of subdividing the first
prism. The first tetrahedron is made up of points P1, P3, P4, and P7. The second tetrahedron is
made up of points P1, P7, P4, and P8. The third tetrahedron is made up of points P1, P7, P8, and
P5. Part (C) illustrates the three tetrahedra that are a result of subdividing the second prism. The
first tetrahedron is made up of points P1, P2, P3, and P5. The second tetrahedron is made up of
points P2, P3, P5, and P6. The third tetrahedron is made up of points P3, P5, P6, and P7.

Figure 7.5: Local Decomposition of a Hexahedron into Tetrahedra
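
Because the decomposition is only a matter of indexing, it can be expressed as small constant tables. The sketch below transcribes the point lists given in the text into 0-based tables (P1 becomes index 0) and sums the tetrahedron volumes as a sanity check. The table and function names are hypothetical, and the pyramid's point ordering (quadrilateral face P1, P2, P4, P5 with P3 opposite it) is an assumption inferred from the face lists in Section 7.1.5.

```cpp
#include <cassert>
#include <cmath>

typedef double DOUBLE_3D[3];  // mirrors the DOUBLE_3D entry of Table 7.1

// Local point indices of each tetrahedron, as listed in the text (0-based).
static const int pyramidTets[2][4] = {{0, 3, 2, 4}, {0, 1, 2, 3}};
static const int prismTets[3][4] = {{0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}};
static const int hexTets[6][4] = {{0, 2, 3, 6}, {0, 6, 3, 7}, {0, 6, 7, 4},
                                  {0, 1, 2, 4}, {1, 2, 4, 5}, {2, 4, 5, 6}};

// Unsigned volume of tetrahedron (a, b, c, d): |det[b-a, c-a, d-a]| / 6.
double tetVolume(const double *a, const double *b,
                 const double *c, const double *d)
{
    double u[3], v[3], w[3];
    for (int k = 0; k < 3; k++) {
        u[k] = b[k] - a[k];
        v[k] = c[k] - a[k];
        w[k] = d[k] - a[k];
    }
    double det = u[0] * (v[1] * w[2] - v[2] * w[1])
               - u[1] * (v[0] * w[2] - v[2] * w[0])
               + u[2] * (v[0] * w[1] - v[1] * w[0]);
    return std::fabs(det) / 6.0;
}

// Sum the volumes of the tetrahedra that table selects from the element's
// points. No new points or connectivity arrays are created.
double decomposedVolume(const DOUBLE_3D *points, const int (*table)[4],
                        int numTets)
{
    assert(numTets > 0);
    double total = 0.0;
    for (int t = 0; t < numTets; t++)
        total += tetVolume(points[table[t][0]], points[table[t][1]],
                           points[table[t][2]], points[table[t][3]]);
    return total;
}
```

For a unit cube, a right triangular prism, and a square pyramid, the summed tetrahedron volumes equal the element volume. Equal volume is a necessary check on the tables and does not by itself prove that the tetrahedra are face matching.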


7.1.5  Element  Neighbors 

To conclude the presentation of the basic data structures, the creation of an element neighbor
map is presented [50]. Before discussing the specific algorithm, the concept of an element
neighbor is introduced. Figure 7.6 displays the element neighbors for a tetrahedral element.
Element neighbor one, N1, is the element that contains a face that matches points P1, P2, and
P3. Element neighbor two, N2, is the element that contains a face that matches points P2, P3,
and P4. Element neighbor three, N3, is the element that contains a face that matches points P3,
P4, and P1. Element neighbor four, N4, is the element that contains a face that matches points
P4, P1, and P2.

Figure 7.6: Element Neighbors For a Tetrahedron

Figure 7.7 displays the element neighbors for a pyramid. Element neighbor one, N1, is the
element that contains a face that matches points P1, P2, and P3. Element neighbor two, N2, is
the element that contains a face that matches points P2, P3, and P4. Element neighbor three, N3,
is the element that contains a face that matches points P3, P4, and P5. Element neighbor four,
N4, is the element that contains a face that matches points P1, P3, and P5. Element neighbor five,
N5, is the element that contains a face that matches points P1, P2, P4, and P5.


Figure  7.7:  Element  Neighbors  For  a  Pyramid 

Figure 7.8 displays the element neighbors for a prism. Element neighbor one, N1, is the
element that contains a face that matches points P1, P2, and P3. Element neighbor two, N2, is
the element that contains a face that matches points P2, P3, P6, and P5. Element neighbor three,
N3, is the element that contains a face that matches points P4, P5, and P6. Element neighbor four,
N4, is the element that contains a face that matches points P1, P3, P6, and P4. Element neighbor
five, N5, is the element that contains a face that matches points P1, P2, P5, and P4.

Figure 7.8: Element Neighbors For a Prism

Figure 7.9 displays the element neighbors for a hexahedral element. Element neighbor one,
N1, is the element that contains a face that matches points P1, P2, P3, and P4. Element neighbor
two, N2, is the element that contains a face that matches points P2, P3, P7, and P6. Element
neighbor three, N3, is the element that contains a face that matches points P5, P6, P7, and P8.
Element neighbor four, N4, is the element that contains a face that matches points P1, P4, P8,
and P5. Element neighbor five, N5, is the element that contains a face that matches points P3,
P4, P8, and P7. Element neighbor six, N6, is the element that contains a face that matches points
P1, P2, P6, and P5.


Figure 7.9: Element Neighbors For a Hexahedron

The  algorithm  to  construct  the  element  neighbor  map  is  based  on  the  algorithm  that  was 
presented  in  [50].  It  assumes  that  the  map  containing  the  elements  surrounding  a  point  has 
already  been  constructed. 

1. Allocate the array eN that will contain the element neighbor information.
eN = new INT_6D[numElements].

2. Cycle over all neighbors j of all elements i and initialize each entry in eN.
eN[i][j] = -555.

3. Cycle over all neighbors j for all elements i in the input grid.

If eN[i][j] == -555 (the element neighbor has not been set),

(a) Find the element ne and the face nf that matches element i at face j.

(b) Set eN[i][j] = ne.

(c) If ne != -999, set eN[ne][nf] = i. (If ne is -999, then face j of element i is a
boundary face. Otherwise, ne is a valid element neighbor.)

Step 1 of the algorithm shows us that the amount of memory required to construct the
element neighbor map is O(6 * N), where N is the number of elements in the input grid. This
version of the algorithm assumes that we are dealing with a mixed element grid. One
array is used to house the entire element neighbor map in a mixed element grid. It is sized
according to the element with the greatest number of neighbors, the hexahedron. If the input grid
is purely tetrahedral, then the size of the memory is actually O(4 * N), resulting in no wasted
memory. Memory is wasted only when dealing with mixed element grids. This can
easily be corrected by keeping track of separate structures for tetrahedra, pyramids, prisms,
and hexahedra. For demonstration purposes, we will keep a single structure and accept the
memory overhead. Step 2 cycles over all neighbors of all elements and initializes all entries
in the element neighbor structure. This loop operates in O(6 * N) time in the worst case. Step 3
again cycles over all neighbors of all elements. If the element neighbor has not been set, then
the first operation is to find the element containing a face whose nodes match those of the
current face of the current element. Finding this common element operates in O(c) time, where c
is the maximum number of elements surrounding the nodes making up the face in question. As was
stated in the section discussing elements surrounding a point, a reasonable worst case estimate
for c is approximately 24. An average value would be somewhere in the range 8 to 24, making c very
small in comparison to the number of nodes in the input grid. The other operations in the loop
in step 3 operate in constant time, making the loop in step 3 operate in a total time of O(cN).
Constructing the element neighbor map is performed exactly once for a given input grid. As long
as the grid connectivity does not change, the element neighbor map is valid. The algorithms that
are presented in future sections require the use of this element neighbor map. After the element
neighbor map is constructed, the memory needed for elements surrounding a point is deallocated.
None of the algorithms in the Compute Server require explicit use of the elements surrounding a
point. Thus, at this point, the required memory returns to O(6 * N), where N is the total number
of elements in the input grid. A full implementation of the construction of the element neighbor
map is shown in Appendix D.
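
The three steps above can be sketched for the simplest case, a purely tetrahedral grid. This is an illustration under stated assumptions and not the Appendix D implementation: the names UNSET, BOUNDARY, tetFace, faceMatches, and buildTetNeighbors are hypothetical, the arrays are flat std::vector objects with four entries per element, and the elements surrounding each point are kept as simple per-point lists for brevity rather than in the compact nESP/eSP form. The sentinel values -555 and -999 and the face ordering are taken from the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const int UNSET = -555;     // neighbor not yet computed (value from the text)
const int BOUNDARY = -999;  // face lies on a boundary (value from the text)

// Local node indices of the four faces of a tetrahedron, following the text:
// N1 = (P1,P2,P3), N2 = (P2,P3,P4), N3 = (P3,P4,P1), N4 = (P4,P1,P2).
static const int tetFace[4][3] = {{0, 1, 2}, {1, 2, 3}, {2, 3, 0}, {3, 0, 1}};

// True when face f of element e consists of nodes a, b, c in any order.
static bool faceMatches(const std::vector<int> &element, int e, int f,
                        int a, int b, int c)
{
    int hits = 0;
    for (int k = 0; k < 3; k++) {
        int n = element[4 * e + tetFace[f][k]];
        if (n == a || n == b || n == c) hits++;
    }
    return hits == 3;
}

// element: 4 point indices per tetrahedron, stored flat.
// Returns eN with 4 entries per element: a neighbor index or BOUNDARY.
std::vector<int> buildTetNeighbors(const std::vector<int> &element,
                                   int numberOfNodes)
{
    assert(element.size() % 4 == 0);
    const int numElements = static_cast<int>(element.size()) / 4;

    // Elements surrounding each point (simplified storage, see lead-in).
    std::vector<std::vector<int> > around(numberOfNodes);
    for (int i = 0; i < numElements; i++)
        for (int j = 0; j < 4; j++)
            around[element[4 * i + j]].push_back(i);

    std::vector<int> eN(4 * numElements, UNSET);  // steps 1 and 2
    for (int i = 0; i < numElements; i++)         // step 3
        for (int j = 0; j < 4; j++) {
            if (eN[4 * i + j] != UNSET) continue;
            int a = element[4 * i + tetFace[j][0]];
            int b = element[4 * i + tetFace[j][1]];
            int c = element[4 * i + tetFace[j][2]];
            int ne = BOUNDARY, nf = -1;
            // (a) only elements that touch node a can share this face
            for (std::size_t k = 0; k < around[a].size() && ne == BOUNDARY; k++) {
                int cand = around[a][k];
                if (cand == i) continue;
                for (int f = 0; f < 4; f++)
                    if (faceMatches(element, cand, f, a, b, c)) {
                        ne = cand;
                        nf = f;
                        break;
                    }
            }
            eN[4 * i + j] = ne;                        // (b)
            if (ne != BOUNDARY) eN[4 * ne + nf] = i;   // (c)
        }
    return eN;
}
```

Setting both eN[i][j] and eN[ne][nf] in one visit is what lets the test in step 3 skip faces that were already resolved from the other side. A mixed element grid would need per-type face tables and the six-wide storage described in the text.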


7.2  Searching 

Equal to the concern for the construction and memory usage needed for the base data
structures is the concern for traversing the unstructured grid. Almost every algorithm in the
Compute Server that performs a significant task requires some sort of searching to be done. The
discussion of searching is given in two parts. First, a volumetric chunking
algorithm is presented and analyzed. Second, the actual searching schemes which use this
volumetric chunking are presented and analyzed.

7.2.1  Volume  Chunking 

Because searching is such a significant part of the navigation and display of unstructured
volumes, a great deal of time was spent investigating existing methods for searching unstructured
grids, and possible new ideas to reduce both memory overhead and search time. Additionally,
a significant problem occurs when searching through a volume that contains embedded
boundaries. Using traditional searching techniques for unstructured grids [50], the search can
hit an embedded boundary during the traversal and exit out of the search, forcing a global search
of all elements in the grid. This situation is illustrated in Figure 7.10. The starting element is
shown as being located in front of the embedded boundary; the actual element containing the
given point is located behind the embedded boundary. Traditional searching techniques navigate
the search along the direction vector from the starting element towards the given point. At some
point in the search, the traversal collides with the embedded boundary and fails; this forces a
brute force search of all unvisited elements. This is obviously an undesirable situation. For this
reason, much work has been done to impose an additional structure on the existing data. Current
methods include, but are not limited to, the creation of additional structures such as octrees,
range trees, interval trees, etc. These structures are generally used to house or categorize the
original data.


Figure 7.10: A Common Problem With Embedded Boundaries and Traditional Search
Techniques

Methods that use octrees are perhaps the simplest to create and understand, and several
versions of building and traversing octrees exist today [51], [52], [53]. Wilhelms and Van Gelder
[54] use a branch-on-need octree to purge subvolumes in the creation of isosurfaces. This method
has a worst case time efficiency of O(k + k log(n/k)), where n is the total number of cells and k
is the number of active cells [55]. This method applies only to structured data sets, and requires
significant changes to allow for unstructured input data. Octrees have been primarily applied to
structured grids and are not easily adapted to deal with unstructured grids [55].

Range-based methods apply to both structured and unstructured data, but are generally more
suitable for unstructured data because they are unable to exploit implicit adjacency information
found in structured data sets. Range-based methods have higher memory requirements.
Gallagher [56] proposes a method based on a subdivision of the range domain into buckets,
and on a classification of intervals based on the buckets they intersect. The tradeoff between
efficiency and memory requirements is highly dependent on the resolution of the bucketing
structure [57]. Giles and Haimes [58] report an approach in which two sorted lists of intervals are
constructed in a preprocessing phase by sorting the cells according to some predefined minimum
and maximum function value. This method exploits the concept of global coherence. More
recently, Shen and Johnson [59] try to improve and overcome some of the limitations presented in
[56] and [58] by adopting similar structures to address global coherence. However, a worst case
computational complexity of O(N) has been estimated for all range-based methods discussed
above [55].

Livnat  [55]  introduced  the  concept  of  a  span  space,  a  two-dimensional  space  where  each 
point  corresponds  to  an  interval  in  the  range  domain.  The  span  space  is  useful  to  geometrically 
understand  range-based  methods.  A  kd-tree  is  used  to  locate  the  active  intervals  in  this  space, 
achieving an O(√n + k) time complexity in the worst case. In a more recent paper, Shen [60]
proposed the use of a uniform grid to locate the active intervals in the span space. The overhead
memory required to impose the kd-tree is approximately 25% above the memory required to
house the original grid itself, and in many cases greater than 25%.

During  the  course  of  this  research,  both  the  octree  method  and  the  range  tree  method  were 
implemented  and  tested  to  determine  usability.  It  was  found  that  the  memory  overhead  required 
for  both  the  octree  and  the  range  tree  outweighed  any  potential  gain  in  both  a  global  and  local 
searching situation. The branch-on-need octree given by Wilhelms and van Gelder has a stated
overhead of approximately 20%. The kd-tree imposed on the original grid, presented
above, showed an overhead of over 25% above the memory required for the input data. For small
problems, this overhead is not a significant problem. However, for large data sets such as those
in this work, an overhead of greater than 20% can mean the difference between being able to
visualize or analyze the problem and not being able to. For this reason, a simpler data structure
was adopted to impose additional structure information on the input grid, while maintaining a
memory overhead of less than 20%. This data structure and its use are termed “volume chunking”.
Given a number of x divisions, y divisions, and z divisions, a volume can be decomposed into
subvolumes or chunks. A very simple version of a volume chunking scheme applied to a cubic
volume is shown in Figure 7.11. In this example, the number of x divisions, y divisions, and z
divisions is equal to two.


Figure 7.11: A Cube Chunked Into Eight Subvolumes


The  algorithm  for  creating  the  volume  chunks  is  given  as: 


1. Compute the bounding box.

2. Find x, y, and z increments based on the number of x divisions, the number of y divisions,
and the number of z divisions. These values can be user supplied or can be computed by
decomposing the volume based on a desired number of elements in each volume. Then
numVolumes = x divisions * y divisions * z divisions.

3. Allocate the structure numElementsInVolume.
numElementsInVolume = new INT_1D[numVolumes].

4. Allocate temporary tNumElementsInVolume for construction of numElementsInVolume.
tNumElementsInVolume = new INT_1D[numVolumes].

5. Cycle over all volumes and initialize tNumElementsInVolume.

6. Cycle over all elements i in the input grid.

(a) Calculate the volume index (vIndex) for the first node.

(b) Increment tNumElementsInVolume[vIndex] by one.

7. Cycle over all volumes i and set:
numElementsInVolume[i] = numElementsInVolume[i-1] + tNumElementsInVolume[i-1].

8. Delete the memory for structure tNumElementsInVolume.

9. Allocate elementsInVolume.

elementsInVolume = new INT_1D[numElements].

10. Cycle over all elements i in the input grid.

(a) Calculate the volume index, vIndex, for the first node of the element i.

(b) Set elementsInVolume[numElementsInVolume[vIndex]] = i.

(c) Increment numElementsInVolume[vIndex] by one.
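
The construction steps above amount to a counting sort of the elements by chunk index. The following is a minimal C++ sketch of that idea, not the implementation from Appendix E. The names VolumeChunks, chunkIndex, buildChunks, and divs are illustrative assumptions, the bounding box is supplied by the caller, and a scratch cursor array is used in step 10 instead of incrementing the offsets in place, so that the offsets remain usable for lookups afterwards. Points are assumed to lie inside the bounding box.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Point = std::array<double, 3>;

// Hypothetical result type: chunk v owns elements[start[v]] .. elements[start[v + 1] - 1].
struct VolumeChunks {
    std::vector<int> start;     // numVolumes + 1 offsets into `elements`
    std::vector<int> elements;  // element ids grouped by chunk
};

// Direct computation of the chunk index of a point inside the bounding box [lo, hi].
int chunkIndex(const Point& p, const Point& lo, const Point& hi,
               const std::array<int, 3>& divs) {
    int idx[3];
    for (int d = 0; d < 3; ++d) {
        assert(hi[d] > lo[d] && divs[d] > 0);
        idx[d] = static_cast<int>((p[d] - lo[d]) / (hi[d] - lo[d]) * divs[d]);
        if (idx[d] == divs[d]) --idx[d];  // a point on the upper face belongs to the last chunk
    }
    return idx[0] * divs[1] * divs[2] + idx[1] * divs[2] + idx[2];
}

// firstNode[i] holds the coordinates of the first node of element i.
VolumeChunks buildChunks(const std::vector<Point>& firstNode, const Point& lo,
                         const Point& hi, const std::array<int, 3>& divs) {
    const int numVolumes = divs[0] * divs[1] * divs[2];
    VolumeChunks c;
    c.start.assign(numVolumes + 1, 0);

    // Count the elements per chunk (step 6), stored one slot to the right.
    for (const Point& p : firstNode) ++c.start[chunkIndex(p, lo, hi, divs) + 1];

    // A running sum turns the counts into starting offsets (step 7).
    for (int v = 0; v < numVolumes; ++v) c.start[v + 1] += c.start[v];

    // Scatter element ids into their chunks (step 10), using a scratch cursor
    // so that the offsets in `start` stay valid afterwards.
    std::vector<int> cursor(c.start.begin(), c.start.end() - 1);
    c.elements.resize(firstNode.size());
    for (int i = 0; i < static_cast<int>(firstNode.size()); ++i)
        c.elements[cursor[chunkIndex(firstNode[i], lo, hi, divs)]++] = i;
    return c;
}
```

With this layout, the elements of chunk v are read back as elements[start[v]] through elements[start[v + 1] - 1], which is what the chunk-wide fallback search needs.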


The  method  of  volume  chunking  uses  the  same  array  indexing  scheme  as  that  shown  in 
the  construction  of  elements  surrounding  a  point.  Steps  3  and  4  of  the  algorithm  shown  above 
allocate  the  amount  of  memory  needed  to  construct  the  volume  chunking  array.  This  memory  is 
seen to be O(V + 1), where V is the number of subvolumes or volume chunks. Step 5 cycles
over all of these volumes in O(V) time. Step 6 operates in O(N) time, where N is the
number of elements in the input grid. Step 7 operates in O(V) time. Step 9 allocates the memory
needed for the volume chunking structure in O(N) space. Step 10 operates in O(N) time. This
results in an overall space usage of O(N + V) and a construction time of O(N). Given the space
requirements for the original grid, the additional space required to build the volume chunking
structure is an overhead of approximately 18%. This overhead falls below any that have
been mentioned in the schemes presented above. The volume chunking scheme easily handles
the problem of getting lost during the traversal. The searching scheme in the next section uses
this volume chunking to select a good starting element to begin the search. If the traversal
gets  lost  during  the  search,  the  routine  cycles  over  all  elements  in  a  given  volume.  The  volume 
chunking  structure  is  specifically  used  to  obtain  a  good  candidate  starting  element  in  a  search  if 
one  is  not  already  available.  It  is  also  used  to  reduce  the  number  of  possible  elements  to  search 
through  should  an  embedded  boundary  be  contained  in  a  given  volume  chunk.  A  discussion 
and  analysis  of  this  volume  chunking  method  for  searching  is  presented  in  the  next  section.  An 
implementation  of  the  volume  chunking  is  given  in  Appendix  E. 


T1:  x index = 0
     y index = 0
     z index = 2
     v index = 0 + 0 + 2

Figure  7.12:  An  Example  of  Volume  Chunking 

The  process  of  finding  the  chunk  that  a  given  point  is  located  in  is  a  direct  computation. 
Figure 7.12 illustrates the chunking of space that contains both a tetrahedron and a hexahedron. To
find  the  chunk  that  the  node  T1  belongs  to,  a  calculation  is  made  to  determine  the  x  index,  the  y 
index, the z index, and the volume index. They are computed in the following manner:

getBoundingBox(Min, Max);

xIndex = (int)(((x-Min[0])/(Max[0]-Min[0]))*numXDivisions);
yIndex = (int)(((y-Min[1])/(Max[1]-Min[1]))*numYDivisions);
zIndex = (int)(((z-Min[2])/(Max[2]-Min[2]))*numZDivisions);
if (xIndex == numXDivisions)
    xIndex--;

if (yIndex == numYDivisions)
    yIndex--;

if (zIndex == numZDivisions)
    zIndex--;

vIndex = xIndex*numYDivisions*numZDivisions +
         yIndex*numZDivisions +
         zIndex;


7.2.2  Global  and  Local  Searching  Techniques 
Searching  is  used  for  a  variety  of  reasons  during  the  process  of  visualizing  scientific  data,  but 
the  primary  reason  for  searching  inside  the  Compute  Server  is  to  determine  element  containment 
for  a  given  point.  The  searching  algorithm  inside  the  Compute  Server  operates  in  two  basic 
modes,  global  searching  and  local  searching.  The  global  searching  technique  uses  the  volume 
chunking  data  structure  and  the  given  point  to  locate  a  good  starting  element.  Then  the  local 
search is invoked. If the local search is unable to find the element containing a given point, then
all elements within the chunk are tested to see if the point is contained within any of them.
Although this is an exhaustive search of the volume chunk, the number of elements inside
that volume chunk is much smaller than the number of elements in the input grid. There are only
two reasons why the local search would fail to find an element containing a given point: (1) the
given point is outside of the volume or inside a cavity that contains no elements, and (2) there
exists an embedded boundary in the volume chunk that was chosen to contain the given point.
Presented  next  is  the  method  for  using  volume  coordinates  to  determine  search  direction  and 
element  containment,  the  base  searching  algorithm,  and  a  recursive  local  searching  algorithm. 

7.2.2.1 Using Volume Coordinates To Determine Point Containment

The calculation of volume coordinates is used to determine point containment, and the
subvolumes that are calculated are used to determine the direction of traversal. This method is based on
that given in [50]. A full implementation is given in Appendix F. Figure 7.13 illustrates how the
computations of the subvolumes V1, V2, V3, and V4 guide the traversal through the unstructured
grid. V1 is calculated as the volume of the tetrahedron formed by the given point, P2, P3, and P4.
If this volume is positive, then the given point is on the inside of the face formed by P2, P3, and
P4. If it is negative, then the given point is on the outside of the face formed by P2, P3, and P4.
If it is zero, then the given point lies on the face formed by P2, P3, and P4. V2 is calculated as
the volume of the tetrahedron formed by the given point, P1, P4, and P3. If V2 is positive, then
the given point lies on the inside of the face formed by P1, P4, and P3. If V2 is negative, then the
given point lies on the outside of the face formed by P1, P4, and P3. Similar conditions apply to
the computation of subvolumes V3 and V4.
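
The sign tests described above can be sketched in C++ as follows. This is an illustrative sketch and not the Appendix F implementation: the source does not give the formula used to calculate a volume, so signedVolume is assumed here to be the usual scalar triple product divided by six, the element is assumed to be ordered so that the volume of (P1, P2, P3, P4) is positive under that convention, and a zero subvolume (point on a face) is counted as contained. The names Vec3, SubVolumes, subVolumes, and contains are hypothetical.

```cpp
#include <array>
#include <cassert>

using Vec3 = std::array<double, 3>;

// Signed volume of the tetrahedron (a, b, c, d): dot(b - a, cross(c - a, d - a)) / 6.
double signedVolume(const Vec3& a, const Vec3& b, const Vec3& c, const Vec3& d) {
    const Vec3 u{b[0] - a[0], b[1] - a[1], b[2] - a[2]};
    const Vec3 v{c[0] - a[0], c[1] - a[1], c[2] - a[2]};
    const Vec3 w{d[0] - a[0], d[1] - a[1], d[2] - a[2]};
    return (u[0] * (v[1] * w[2] - v[2] * w[1]) +
            u[1] * (v[2] * w[0] - v[0] * w[2]) +
            u[2] * (v[0] * w[1] - v[1] * w[0])) / 6.0;
}

struct SubVolumes { double v1, v2, v3, v4; };

// Subvolumes in the vertex order used in the text; they sum to the element volume.
SubVolumes subVolumes(const Vec3& p, const Vec3& p1, const Vec3& p2,
                      const Vec3& p3, const Vec3& p4) {
    assert(signedVolume(p1, p2, p3, p4) > 0.0);  // assumed element orientation
    return {signedVolume(p, p2, p3, p4),   // V1: face (P2, P3, P4)
            signedVolume(p, p1, p4, p3),   // V2: face (P1, P4, P3)
            signedVolume(p, p4, p1, p2),   // V3: face (P4, P1, P2)
            signedVolume(p, p3, p2, p1)};  // V4: face (P3, P2, P1)
}

// The point is inside (or on the boundary of) the element when no subvolume is negative.
bool contains(const SubVolumes& s) {
    return s.v1 >= 0.0 && s.v2 >= 0.0 && s.v3 >= 0.0 && s.v4 >= 0.0;
}
```

A negative subvolume identifies the face through which the search should leave the element, which is how the same computation serves both containment and direction of traversal.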

7.2.2.2 Searching Algorithm

The base searching algorithm can be stated in the following steps, and is based on the
searching algorithm given in [50]:

1. If a good starting element is needed, find one using the volume chunking structure.

2.  Recursively  search  for  element  containment. 

3.  If  the  recursive  search  fails  to  find  an  element,  then  cycle  over  all  elements  in  the  volume 
chunk  containing  the  given  point  that  have  not  previously  been  visited. 

    V  = Calculate Volume (P1, P2, P3, P4)

    V1 = Calculate Volume (P, P2, P3, P4)
    V2 = Calculate Volume (P, P1, P4, P3)
    V3 = Calculate Volume (P, P4, P1, P2)
    V4 = Calculate Volume (P, P3, P2, P1)

    if (V1 < 0.0)
        TRAVEL OUT FACE 2 (P2, P3, P4)

Figure 7.13: Volume Coordinates

The recursive algorithm is given the point, the current element in the traversal, an array of
flags indicating whether an element has been visited, and the current visit. This algorithm can
be stated as:

1. Set the visited flag for the current element e to the current visit.

2. Calculate whether the current element contains the given point by computing subvolumes
V1, V2, V3, and V4.

3. If V1 > 0 and V2 > 0 and V3 > 0 and V4 > 0, then the current element contains the given
point. Return successfully with the current element.

4. If V1 < 0 or V2 < 0 or V3 < 0 or V4 < 0,

• If the volume coordinate V1 is negative and element neighbor 2 is valid and not
visited,  then  set  the  current  element  to  element  neighbor  2,  and  call  the  recursive 
search  again. 

•  If  the  volume  coordinate  V2  is  negative  and  element  neighbor  3  is  valid  and  not 
visited,  then  set  the  current  element  to  element  neighbor  3,  and  call  the  recursive 
search  again. 

•  If  the  volume  coordinate  V3  is  negative  and  element  neighbor  4  is  valid  and  not 
visited,  then  set  the  current  element  to  element  neighbor  4,  and  call  the  recursive 
search  again. 

• If the volume coordinate V4 is negative and element neighbor 1 is valid and not
visited, then set the current element to element neighbor 1, and call the recursive
search again.

Finding an initial starting element using the volume chunking structure operates in O(1)
time. Given a good starting element, the recursive search to find the containing element operates
by navigating through the elements between the starting element and the containing
element, provided no embedded boundary is encountered. This would operate in a worst case
time of O(N^(1/3)), where N is the number of elements in the input grid. This worst case estimate
is based on a rectilinear three-dimensional volume that contains N elements, for which the number
of elements along the diagonal through the volume is O(N^(1/3)). If the recursive search should fail,
then the search routine would operate in a worst case time of O(nV), where nV is the number of
elements contained in volume index V. Implementations of the searching algorithms are given in
Appendix F.
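
The recursive search and its chunk-wide fallback can be sketched in C++ as shown below. This is a hedged illustration, not the Appendix F code. The Mesh and Tet types, the function names, and the use of -1 for an invalid neighbor are assumptions; the mapping from negative subvolumes to neighbors (V1 to neighbor 2, V2 to neighbor 3, V3 to neighbor 4, V4 to neighbor 1) follows the steps above. The sketch also tries the remaining negative faces if one branch fails, and treats a zero subvolume as contained; the text does not state either behavior.

```cpp
#include <array>
#include <cassert>
#include <vector>

using Vec3 = std::array<double, 3>;

struct Tet {
    std::array<int, 4> node;      // node ids of P1..P4, positively oriented
    std::array<int, 4> neighbor;  // neighbor[k]: element across face k + 1, or -1
};
struct Mesh {
    std::vector<Vec3> xyz;
    std::vector<Tet> tets;
};

double signedVolume(const Vec3& a, const Vec3& b, const Vec3& c, const Vec3& d) {
    const Vec3 u{b[0] - a[0], b[1] - a[1], b[2] - a[2]};
    const Vec3 v{c[0] - a[0], c[1] - a[1], c[2] - a[2]};
    const Vec3 w{d[0] - a[0], d[1] - a[1], d[2] - a[2]};
    return (u[0] * (v[1] * w[2] - v[2] * w[1]) + u[1] * (v[2] * w[0] - v[0] * w[2]) +
            u[2] * (v[0] * w[1] - v[1] * w[0])) / 6.0;
}

// Subvolumes V1..V4 of element e with respect to point p.
std::array<double, 4> subVolumes(const Mesh& m, int e, const Vec3& p) {
    const Tet& t = m.tets[e];
    const Vec3& p1 = m.xyz[t.node[0]];
    const Vec3& p2 = m.xyz[t.node[1]];
    const Vec3& p3 = m.xyz[t.node[2]];
    const Vec3& p4 = m.xyz[t.node[3]];
    return {signedVolume(p, p2, p3, p4), signedVolume(p, p1, p4, p3),
            signedVolume(p, p4, p1, p2), signedVolume(p, p3, p2, p1)};
}

bool allNonNegative(const std::array<double, 4>& v) {
    return v[0] >= 0.0 && v[1] >= 0.0 && v[2] >= 0.0 && v[3] >= 0.0;
}

// Recursive local search; returns the containing element or -1.
int localSearch(const Mesh& m, const Vec3& p, int e, std::vector<int>& visited, int visit) {
    visited[e] = visit;
    const std::array<double, 4> v = subVolumes(m, e, p);
    if (allNonNegative(v)) return e;
    static const int neighborFor[4] = {1, 2, 3, 0};  // V1->nbr 2, V2->nbr 3, V3->nbr 4, V4->nbr 1
    for (int k = 0; k < 4; ++k) {
        const int n = m.tets[e].neighbor[neighborFor[k]];
        if (v[k] < 0.0 && n >= 0 && visited[n] != visit) {
            const int found = localSearch(m, p, n, visited, visit);
            if (found >= 0) return found;
        }
    }
    return -1;
}

// Base search: local search from `start`, then an exhaustive pass over the chunk.
int search(const Mesh& m, const Vec3& p, int start, const std::vector<int>& chunkElements,
           std::vector<int>& visited, int visit) {
    assert(visited.size() == m.tets.size());
    const int e = localSearch(m, p, start, visited, visit);
    if (e >= 0) return e;
    for (int c : chunkElements) {
        if (visited[c] == visit) continue;  // already tested during the local search
        visited[c] = visit;
        if (allNonNegative(subVolumes(m, c, p))) return c;
    }
    return -1;
}
```

Stamping elements with the current visit number, rather than a boolean, means the flag array never has to be cleared between searches. For very long traversals an explicit stack may be preferable to recursion.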

As  is  shown  in  the  algorithm  above,  an  element  containment  test  is  performed.  If  the  given 
point  is  not  contained  within  the  candidate  element,  then  the  resulting  volume  coordinates  are 
used  to  determine  the  direction  of  traversal.  Completing  the  discussion  of  the  searching  requires 
an explanation of the computation of volume coordinates. This computation calculates the
subvolumes V1, V2, V3, and V4.


7.3  Cutting  Planes 

Cutting planes are perhaps the most common method for extracting information from a
volume solution. Displayed properly, a cutting plane can convey a variety of information, from the
shape and structure of the grid at that plane to the behavior of the solution in that plane. It is a
widely accepted method for querying the physical properties of a solution; however, it can also be
one of the most costly. If the original volume grid and solution are large, an arbitrarily
placed cutting plane can potentially be very large as well. Because cutting planes are very widely
used, and because they have the potential for displaying a great deal of information at one time, it
is important to make sure that they are generated optimally with respect to space
and quickly with respect to CPU performance. The following sections present a new cutting
plane algorithm that generates a cutting plane with no duplication of intersection points and
without redundant element visits in the grid. As the cut is generated, the knowledge of where the
cut has already occurred is retained to avoid unnecessary intersection calculations and to avoid
duplicating memory by storing redundant point information. This method can handle multiple
disjoint grids, and it can handle all previously described element types. The cutting plane is
specified by a single point and a single normal. The algorithm proceeds as follows:


1.  Initialize  all  global  structures  and  arrays  needed  to  mark  visits  to  elements. 

2.  Find  an  element  e  containing  a  point  that  lies  in  the  cutting  plane. 

3.  Intersect  the  plane  with  the  element  and  record  the  intersection  points  and  the  triangles. 
At  this  point,  we  do  not  have  to  worry  about  duplicating  points  because  the  intersection 
routine  returns  a  copy  of  the  points  listed  exactly  once.  The  intersection  routine  also 
returns  information  giving  which  element  faces  have  been  intersected. 

4.  Set  this  element  e  to  visited. 

5. Call the recursive cutting plane algorithm to grow the cutting plane out from element e.

The recursive cutting plane algorithm can be stated as:


1. If face 1 of e has been intersected and the neighbor attached to face 1 is valid and has not
been  visited,  then 

(a)  Calculate  intersection  parameters  to  determine  whether  element  neighbor  1  intersects 
the  plane.  If  the  element  intersects  the  plane, 

(b) Find the intersection points and intersecting faces of the plane with element neighbor 1.

(c)  Record  all  non-duplicate  points. 

(d)  Record  all  triangles. 

(e)  Set  element  neighbor  1  to  visited. 

(f)  Call  the  recursive  cutting  plane  algorithm  to  grow  the  cutting  plane  out  from  element 
neighbor 1.

2.  If  face  2  of  e  has  been  intersected  and  the  neighbor  attached  to  face  2  is  valid  and  has  not 
been  visited,  then 

(a)  Calculate  intersection  parameters  to  determine  whether  element  neighbor  2  intersects 
the  plane.  If  the  element  intersects  the  plane, 

(b) Find the intersection points and intersecting faces of the plane with element neighbor 2.

(c) Record all non-duplicate points.

(d) Record all triangles.

(e)  Set  element  neighbor  2  to  visited. 

(f)  Call  the  recursive  cutting  plane  algorithm  to  grow  the  cutting  plane  out  from  element 
neighbor  2. 

3.  If  face  3  of  e  has  been  intersected  and  the  neighbor  attached  to  face  3  is  valid  and  has  not 
been  visited,  then 

(a)  Calculate  intersection  parameters  to  determine  whether  element  neighbor  3  intersects 
the  plane.  If  the  element  intersects  the  plane, 

(b) Find the intersection points and intersecting faces of the plane with element neighbor 3.

(c)  Record  all  non-duplicate  points. 

(d)  Record  all  triangles. 

(e)  Set  element  neighbor  3  to  visited. 

(f)  Call  the  recursive  cutting  plane  algorithm  to  grow  the  cutting  plane  out  from  element 
neighbor  3. 

4.  If  face  4  of  e  has  been  intersected  and  the  neighbor  attached  to  face  4  is  valid  and  has  not 
been  visited,  then 

(a)  Calculate  intersection  parameters  to  determine  whether  element  neighbor  4  intersects 
the  plane.  If  the  element  intersects  the  plane, 

(b) Find the intersection points and intersecting faces of the plane with element neighbor 4.

(c) Record all non-duplicate points.

(d) Record all triangles.

(e)  Set  element  neighbor  4  to  visited. 

(f)  Call  the  recursive  cutting  plane  algorithm  to  grow  the  cutting  plane  out  from  element 
neighbor  4. 

This  cutting  plane  algorithm  produces  a  plane  that  has  a  minimal  representation  of  points 
given the resulting triangulation. The algorithm operates in worst case O(P) time, where P is
the  total  number  of  elements  in  each  volume  chunk  that  is  intersected  by  the  cutting  plane.  An 
implementation  of  the  cutting  plane  algorithm  is  given  in  Appendix  G. 
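
The growth of the cut through intersected faces can be sketched in C++ for a purely tetrahedral grid as follows. This is an illustrative sketch and not the Appendix G implementation. The source does not say how duplicate points are recognized; here each intersection point is keyed by the grid edge (pair of node ids) on which it lies, which is an assumption. The names PlaneCutter, CutResult, and Tet are hypothetical, the face numbering (face 1 = P1,P2,P3; face 2 = P2,P3,P4; face 3 = P1,P4,P3; face 4 = P1,P2,P4) is inferred from the traversal rules of the searching algorithm, the seed element is supplied by the caller, and nodes lying exactly on the plane are not treated specially.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

using Vec3 = std::array<double, 3>;
using Tri = std::array<int, 3>;

struct Tet {
    std::array<int, 4> node;      // node ids of P1..P4
    std::array<int, 4> neighbor;  // neighbor[k]: element across face k + 1, or -1
};
struct CutResult {
    std::vector<Vec3> points;    // each intersection point stored exactly once
    std::vector<Tri> triangles;  // indices into `points`
};

class PlaneCutter {
public:
    PlaneCutter(const std::vector<Vec3>& xyz, const std::vector<Tet>& tets,
                const Vec3& origin, const Vec3& normal)
        : xyz_(xyz), tets_(tets), origin_(origin), normal_(normal),
          visited_(tets.size(), 0) {}

    // Grow the cut outward from a seed element that touches the plane.
    CutResult run(int seed) {
        assert(seed >= 0 && seed < static_cast<int>(tets_.size()));
        visited_[seed] = 1;
        if (cutElement(seed)) grow(seed);
        return result_;
    }

private:
    double dist(int n) const {  // signed distance of node n, scaled by |normal|
        double d = 0.0;
        for (int k = 0; k < 3; ++k) d += (xyz_[n][k] - origin_[k]) * normal_[k];
        return d;
    }
    bool above(int n) const { return dist(n) > 0.0; }

    // Point where edge (a, b) crosses the plane; reused if already computed.
    int pointOn(int a, int b) {
        const std::pair<int, int> key(std::min(a, b), std::max(a, b));
        const auto it = pointOnEdge_.find(key);
        if (it != pointOnEdge_.end()) return it->second;
        const double da = dist(a), db = dist(b), t = da / (da - db);
        Vec3 p;
        for (int k = 0; k < 3; ++k) p[k] = xyz_[a][k] + t * (xyz_[b][k] - xyz_[a][k]);
        result_.points.push_back(p);
        return pointOnEdge_[key] = static_cast<int>(result_.points.size()) - 1;
    }

    // Intersect one element; returns false if the plane misses it.
    bool cutElement(int e) {
        std::vector<int> pos, neg;
        for (int n : tets_[e].node) (above(n) ? pos : neg).push_back(n);
        if (pos.empty() || neg.empty()) return false;
        if (pos.size() == 2) {  // quadrilateral section, split into two triangles
            const int q0 = pointOn(pos[0], neg[0]), q1 = pointOn(pos[0], neg[1]);
            const int q2 = pointOn(pos[1], neg[1]), q3 = pointOn(pos[1], neg[0]);
            result_.triangles.push_back(Tri{q0, q1, q2});
            result_.triangles.push_back(Tri{q0, q2, q3});
        } else {                // one node is alone on its side: triangular section
            const std::vector<int>& lone = pos.size() == 1 ? pos : neg;
            const std::vector<int>& rest = pos.size() == 1 ? neg : pos;
            const int q0 = pointOn(lone[0], rest[0]), q1 = pointOn(lone[0], rest[1]);
            const int q2 = pointOn(lone[0], rest[2]);
            result_.triangles.push_back(Tri{q0, q1, q2});
        }
        return true;
    }

    // Recurse through every intersected face into unvisited neighbors.
    void grow(int e) {
        static const int omitted[4] = {3, 0, 1, 2};  // local node not on face 1..4
        const Tet& t = tets_[e];
        for (int f = 0; f < 4; ++f) {
            const int nb = t.neighbor[f];
            if (nb < 0 || visited_[nb]) continue;
            int nAbove = 0;
            for (int k = 0; k < 4; ++k)
                if (k != omitted[f] && above(t.node[k])) ++nAbove;
            if (nAbove == 0 || nAbove == 3) continue;  // face f + 1 is not cut
            visited_[nb] = 1;
            if (cutElement(nb)) grow(nb);
        }
    }

    const std::vector<Vec3>& xyz_;
    const std::vector<Tet>& tets_;
    Vec3 origin_, normal_;
    std::vector<char> visited_;
    std::map<std::pair<int, int>, int> pointOnEdge_;
    CutResult result_;
};
```

Only elements reachable from the seed through cut faces are visited, so the work is proportional to the number of intersected elements rather than to the size of the grid. Handling disjoint grids would require one seed per connected region, which this sketch leaves to the caller.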

An example of how the algorithm travels from one face to the next is given in Figure 7.14.
The tetrahedron E1, formed from the points P1, P2, P3, and P4, shares a face with the tetrahedron
E2, formed from the points P1, P2, P3, and P5. E1 is intersected with the cutting plane, resulting
in the intersection points I1, I2, I3, and I4. The intersection routine flags face 1 as having been cut,
and the traversal carries out through face 1 into element E2. E2 returns an additional intersection
point, I5, recognizing the duplicates I1 and I2.



Figure  7.14:  An  Example  of  the  Cutting  of  Neighboring  Elements 

7.4  Isosurfaces 

Although  this  research  is  not  specifically  concerned  with  data  structures  for  explicit 
isosurface  creation,  the  methods  used  to  create  the  cutting  plane  with  no  duplicate  points  can 
be  extended  to  handle  isosurface  creation  with  no  duplicate  points.  The  creation  of  an  isosurface 
is quite similar to the creation of a cutting plane. The volume chunking can also be extended
to group elements into chunks based on function values as opposed to spatial location. The
memory overhead and timings are the same as those for the volume chunking and cutting
plane algorithms already presented.


7.5  Summary 

A  set  of  data  structures  and  algorithms  has  been  presented  which  operate  well  on  large  scale 
unstructured scientific data. These algorithms and data structures are implemented within the
confines of a framework called the Compute Server. It is now prudent to summarize the analysis
of the new algorithms that have been presented and to provide performance examples.

A  new  method  for  searching  through  an  unstructured  volume  has  been  presented.  When  the 
input  data  is  brought  into  the  system,  an  additional  structure  is  imposed  on  this  data  called  a 
volume  chunking  structure.  This  structure  is  used  to  avoid  an  exhaustive  search  of  the  volume 
when  search  methods  fail  due  to  the  presence  of  embedded  boundaries.  The  search  method 
consults  the  volume  chunking  structure  only  when  a  good  candidate  element  is  needed  and 
not  already  known.  Once  the  volume  chunking  structure  returns  a  good  candidate  element, 
the search navigates through the volume using a local search. If the local search fails due to
an embedded boundary in the volume chunk in which the point is contained, the search will
perform a brute force traversal of all elements in the selected volume chunk. The overhead for this
volume chunking structure is less than 18%. The overhead is computed by dividing the amount
of memory needed for the volume chunking structure by the memory that is allocated for the
input data. The worst case performance for a search is O(V), where V is the number of
elements contained in the volume chunk selected. V is typically much less than the number of
elements in the entire volume, thus significantly improving on the exhaustive search needed when an
embedded boundary is encountered. An average search time for the new method is O(N^(1/3)). As
a comparison, a brute force method for navigating through the volume results in a time estimate
of O(N), where N is the number of elements in the volume. It has no overhead. A local area
search with an element neighbor map, like that used in the Field Encapsulation Library [47], has
an average traversal time of O(N^(1/3)) and a worst case traversal time of O(N), with no overhead.
The octrees presented by Wilhelms and van Gelder [54] have an impressive search time of
O(K + K log(N/K)), where K is the number of active elements and N is the total number
of elements in the volume. However, the overhead is approximately 20%, and the method is
unable to handle embedded boundaries and unstructured data sets. A method using k-d trees by
Livnat, et al. [55], has a search time of O(√N + K), where K is the number of active elements
and N is the total number of elements in the volume. An extension to the k-d trees, interval trees,
presented by Cignoni, et al. [57], has a search time of O(log N + K), where K is the number of
active elements. The overhead for both of these structures is at least 25% and can be much larger
than 25%. An example of three different size data sets is shown in Table 7.2. The Fighter is a
notional swept wing fighter geometry from McDonnell Douglas. The X-38 is a geometry for the NASA
escape pod for the International Space Station, and the Minita is a notional tilt-rotor geometry.


Table  7.2:  Example  Data  Sets  For  Timing  Algorithms 


              Nodes   Triangles   Tetrahedra   Pyramids      Prisms
Fighter      64,924      24,322      349,018          0           0
X-38        335,274      41,786    1,943,483          0           0
Minita    4,806,397     462,872    7,439,997     34,517   6,875,063

Timing  results  for  the  modified  searching  algorithm  discussed  above  are  shown  in  Table 
7.3.  All  timings  were  performed  on  a  Silicon  Graphics  Octane  MXE  with  a  300  MHz  R12000 
processor  and  2  GBytes  of  main  memory.  Three  separate  methods  for  searching  were  timed.  The 
method  labeled  Brute  Force  indicates  an  exhaustive  search  through  the  entire  volume  given  an 
instance  where  an  embedded  boundary  becomes  an  obstacle.  The  column  labeled  Traditional 
Searching  reveals  timings  for  a  method  that  starts  with  element  0  and  proceeds  to  navigate 
through the volume with local search techniques until an embedded boundary is encountered.
The  method  then  performs  an  exhaustive  search  through  all  elements  in  the  volume  that  have 
not  previously  been  visited.  The  column  labeled  Modified  Searching  shows  the  timings  for  the 
new  searching  algorithm  that  has  been  developed  as  a  result  of  this  research.  These  timings 
are  encouraging.  The  modified  searching  technique  appears  to  remain  relatively  constant  in  the 
amount  of  time  it  requires  to  find  the  desired  element  regardless  of  the  size  of  the  volume.  Both 
brute  force  and  traditional  searching  techniques  show  a  significant  increase  in  the  amount  of  time 
required  when  the  size  of  the  data  increases  dramatically.  The  overhead  also  appears  to  become 
less significant as the size of the data grows.

Table 7.3: Timing Results For Modified Searching Algorithm (in CPU seconds)

          Brute Force   Traditional   Modified    Size of      Size of     %
          Searching     Searching     Searching   Grid         Overhead    Overhead
Fighter   --            --            --            8.83 MB     1.40 MB    15.8
X-38      3.49          3.25          0.024        47.42 MB     7.78 MB    16.4
Minita    46.96         13.33         --          463.04 MB    57.40 MB    12.4

A  new  cutting  plane  algorithm  has  been  presented  that  creates  an  arbitrary  cutting  plane  with 
an  optimal  number  of  points  given  the  default  triangulation.  This  default  triangulation  refers 
to  the  set  of  triangles  that  result  from  cutting  the  elements  without  any  compression  techniques 
applied  to  the  triangulation.  The  new  method  for  calculating  an  arbitrary  cutting  plane  operates 
in O(P) time, where P is the number of elements contained in each of the volume chunks that
intersect the cutting plane. Given a judicious choice for the number of volume chunks, V, P
can be seen to be O(N^(2/3)). It can also be shown that eliminating duplicate points from the
cutting plane will result in a maximum compression of O(6), that is, roughly a factor of 6.
Because we are dealing with volumes generated for the purposes of computational fluid dynamics,
we can be assured that there exists reasonable quality in the grids. This reasonable quality equates
to an arbitrary cut that has an average of 6 elements surrounding each point. This is because the
distributions of the angles in the surface grid are between 50° and 70°. An average angle of 60°
results in 6 elements surrounding a given point (360°/60° = 6). With an average of 6 elements
surrounding each point, each point could be represented up to 6 times, once for each triangle that
shares it. With elimination of duplicate points, a maximum compression of O(6) is achieved. FEPLOT3D
[61] operates in O(N) time and has a duplicate point representation. The cutting plane algorithm
in FAST [48] operates in a similar O(N) time with duplicate points. The Field Encapsulation
Library [47] operates in O(N^(2/3)) time with a duplication of points. Timing results for the
modified cutting plane algorithm discussed above are shown in Table 7.4. These timings show
a speedup of 3.78 to 4.18 times over traditional cutting plane algorithms. The
traditional cutting plane algorithms refer to those that visit every element in the volume and do
not exploit coherency through the use of an element neighbor map. As expected from the
complexity analysis, the compression is seen to be in the range of 5 to 6 times that of traditional
methods. Again, these methods appear to scale well with the problem size.


Table  7.4:  Timing  Results  For  Modified  Cutting  Plane  Algorithm  (in  CPU  seconds) 


          Traditional Cutting       Modified Cutting        Speedup   Compression
Fighter    0.68   25422 Points       0.18    5062 Points    3.78X     5.02X
X-38       3.59  117798 Points       0.87   21253 Points    4.13X     5.54X
Minita    29.09  353721 Points       6.96   66116 Points    4.18X     5.35X

Analysis  of  these  data  structures  and  algorithms  has  been  presented,  and  a  justification  for 
the  means  of  implementation  has  been  discussed.  The  next  chapter  provides  a  summary  of  the 
information  that  has  been  presented  in  this  research. 


CHAPTER  VIII 


RESULTS  AND  CONCLUSIONS 

8.1  Results 

The results of this research culminate in the development of a set of algorithms and
the generation of an animation depicting the separation of a single booster from the Delta II
configuration.  What  follows  is  an  overview  of  the  animation  procedures  that  were  used  to 
generate  an  unsteady  animation  of  this  event  and  an  explanation  of  each  segment  of  the  resulting 
movie. 


8.1.1  Animation  Procedures 

Each frame of the animation was generated through a batch version of the DIVA software
and was rendered using a modified version of POVRAY [62]. The frames were generated on both
a Sun Enterprise 10000 (E10000) with 64 processors, each containing 2 GB of main memory, and
a Sun Cluster with 64 processors, each containing 2 GB of main memory. The segments were
produced independently of each other and were run in parallel on the E10000 or the Cluster. The
solution originally produced 1180 data sets; however, the animation was generated by selecting
every fourth data set, resulting in 297 frames for the movie. The movie contains 4 segments of
unsteady information and 3 segments of static information. The 4 unsteady segments contain
297 frames each, making the total animation contain 1191 frames. Each frame was ray traced
on either the E10000 or the Cluster and required approximately 5 minutes to 1 hour to render.
The unsteady segments were run in parallel on either the E10000 or the Sun Cluster using 32
processors at a time.

8.1.2 Storyboard for the movie

An  initial  storyboard  was  planned  out  to  try  to  depict  the  movement  and  behavior  of  the 
tumbling  booster  over  the  297  time  steps.  The  final  animation  was  divided  into  eight  segments, 
and  was  designed  to  portray  the  behavior  and  the  movement  of  the  tumbling  booster  in  the  most 
informative  manner.  Each  of  the  individual  sections  is  described  in  the  following  paragraphs. 

8.1.2.1 Full Delta II viscous configuration

A viscous solution of the full Delta II configuration was generated to serve as a frame of
reference for the problem. The animation begins with a view of the Delta II with contours
of density from a viscous solution plotted on the body itself. The image that appears in the
movie is shown in Figure 8.1.

8.1.2.2 Close-Up of booster separating from full Delta II viscous configuration

The viewer is then shown a close-up of one of the boosters that is beginning to separate from the
full configuration. This image is shown in Figure 8.2.

8.1.2.3 Overall tumbling trajectory of booster separation

An  overall  view  of  the  trajectory  of  the  tumbling  booster  is  shown  in  Figure  8.3.  This  is 
given  to  orient  the  viewer  to  the  overall  motion  and  path  that  the  booster  travels  from  separation 
to  the  end  of  the  simulation.  The  initial  frame  is  at  the  top  of  the  figure  and  the  final  frame  is 
given  at  the  bottom  right  of  the  figure.  The  figure  contains  99  of  the  frames  that  were  used  to 
generate  the  animation. 

8.1.2.4 Animation of density contour on normal cut from rigid pole camera view point

The first unsteady sequence of the movie shows density contours on a cutting plane normal
to the booster. During the sequence, each plane is calculated in the same location relative to the
booster as it is tumbling. The camera is placed at an initial position some distance away in the z
direction. During the animation, the camera remains attached to the body as if it were attached
by a rigid pole. When the booster rolls, the camera rolls with it. Figure 8.4 shows the initial
frame of the animation at time t=0s. The density contours are plotted with a minimum value of
0.0 and a maximum value of 1.7. Figure 8.5 shows the booster in this same animation sequence
at time t=21.8s. Figure 8.6 shows the booster in this same animation sequence at time t=29.5s.

8.1.2.5 Animation of density contour on normal cut from dynamic xy camera view point

The second unsteady sequence of the movie shows density contours on a cutting plane
normal to the booster. During the sequence, each plane is calculated in the same location relative
to the booster as it is tumbling. The camera is placed at an initial position some distance away
in the z direction. During the animation, the camera moves in the x and y directions
with the body, but remains fixed in z. This means that the camera travels the same distance in
x and y as the booster. However, if the booster rolls toward the camera (moves toward
the camera), then the booster appears to get larger. If the booster rolls away from the camera,
then the booster appears to get smaller. Figure 8.7 shows the initial frame of the animation at
time t=0s. The density contours are plotted with a minimum value of 0.0 and a maximum value
of 1.7. Figure 8.8 shows the booster in this same animation sequence at time t=21.8s.
Figure 8.9 shows the booster in this same animation sequence at time t=29.5s.
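
The two camera models used so far (the rigid pole camera of the previous sequence and the
dynamic xy camera of this one) can be sketched as follows. This is only an illustration under
stated assumptions: the body pose is assumed to be available as a position and a row-major
rotation matrix, the dynamic xy camera is assumed to sit directly above the body in x and y,
and the names Vec3, Mat3, rigidPoleCamera, and dynamicXYCamera are hypothetical rather than
taken from DIVA.

```cpp
#include <array>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<Vec3, 3>;   // row-major rotation matrix of the body (assumed input)

// Rigid pole camera: the pole (offset) is rotated with the body and attached to
// the body position, so the camera translates and rolls with the booster.
Vec3 rigidPoleCamera(const Vec3& bodyPos, const Mat3& bodyRot, const Vec3& offset)
{
    Vec3 cam{};
    for (int r = 0; r < 3; ++r)
    {
        cam[r] = bodyPos[r]
               + bodyRot[r][0] * offset[0]
               + bodyRot[r][1] * offset[1]
               + bodyRot[r][2] * offset[2];
    }
    return cam;
}

// Dynamic xy camera: follows the body in x and y but keeps its initial z, so
// motion of the body along z shows up as a change in apparent size.
Vec3 dynamicXYCamera(const Vec3& bodyPos, double cameraZ)
{
    Vec3 cam{};
    cam[0] = bodyPos[0];
    cam[1] = bodyPos[1];
    cam[2] = cameraZ;
    return cam;
}
```

With an identity rotation and an offset of (0, 0, 10), both models place the camera 10 units
from the body along z; they differ once the body rotates or moves in z.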

8.1.2.6 Animation of density contours on two normal cuts from dynamic xy camera view point

The third unsteady sequence of the movie shows density contours on two cutting planes
normal to the booster. During the sequence, each plane is calculated in the same location relative
to the booster as it is tumbling. The camera is placed at an initial position some distance away
in the z direction. During the animation, the camera moves in the x and y directions
with the body, but remains fixed in z. The cuts are made opaque or transparent based on the
dot product of the view vector with the normal to the cutting plane. The view vector is defined
as the position that the camera is looking at minus the position at which the camera is
physically located. The cuts are rendered completely transparent if the
normal to the cut is orthogonal to the view vector (the dot product of the normal to the cut and
the view vector is 0.0). The cut is rendered as completely opaque if the dot product of the normal
to the cut with the view vector is 1.0. A linear variation is used for any value between 0.0 and
1.0. This sequence allows the user to always have a full view of the solution parameters while
maintaining the view of the full motion. The first unsteady sequence allowed for a full view
of the solution, but did not allow for an accurate view of the full motion. The second sequence
sacrificed the full view of the solution to allow the user to see an accurate view of the full motion.
Figure 8.10 shows the initial frame of the animation at time t=0s. Again, the density contours are
plotted with a minimum value of 0.0 and a maximum value of 1.7. Figure 8.11 shows the booster
in this same animation sequence at time t=21.8s. Figure 8.12 shows the booster in this same
animation sequence at time t=29.5s.
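
The opacity rule described above can be sketched as follows. This is a minimal illustration,
not the DIVA implementation: it assumes that both vectors are normalized before the dot
product is taken and that the absolute value is used so that the result does not depend on
which way the plane normal points. The names Vec3, dot, normalized, and cutOpacity are
hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

double dot(const Vec3& a, const Vec3& b)
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

Vec3 normalized(const Vec3& v)
{
    const double len = std::sqrt(dot(v, v));
    assert(len > 0.0);
    Vec3 unit{};
    for (int k = 0; k < 3; ++k)
        unit[k] = v[k] / len;
    return unit;
}

// Opacity of a cut: 0.0 (transparent) when the plane normal is orthogonal to the
// view vector, 1.0 (opaque) when they are parallel, linear in the dot product
// in between. The view vector is the look-at position minus the camera position.
double cutOpacity(const Vec3& cameraPos, const Vec3& lookAt, const Vec3& planeNormal)
{
    Vec3 view{};
    for (int k = 0; k < 3; ++k)
        view[k] = lookAt[k] - cameraPos[k];
    // Assumption: unit vectors and absolute value, so the result lies in [0, 1].
    return std::fabs(dot(normalized(view), normalized(planeNormal)));
}
```

A cut seen edge-on therefore vanishes, while a cut facing the camera hides whatever lies
behind it, which matches the behavior described for this sequence.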

8.1.2.7 Animation of isosurfaces of two density values from dynamic xy camera view point

The final unsteady sequence of the movie shows time-varying isosurfaces on the volume
solution of the tumbling booster that grow and shrink over time. The isosurfaces are generated
from density values of 0.45 and 1.35. The isosurface shown in magenta is density=1.35 and
the isosurface shown in blue is density=0.45. The camera is placed at an initial position some
distance away in the z direction. During the animation, the camera moves in the x and
y directions with the body, but remains fixed in z. Figure 8.13 shows the initial frame of
the animation at time t=0s. Figure 8.14 shows the booster in this same animation sequence
at time t=21.8s. Figure 8.15 shows the booster in this same animation sequence at time
t=29.5s.

8.2  Conclusions 

Visualizing the results from large-scale unstructured unsteady simulations is an interesting
and creative process. Several areas of research were brought together to develop both the
algorithms and the resulting animation. These areas include grid generation for both static and
dynamically changing grids, parallel unsteady solution methods, and visualization techniques
for extracting and rendering high quality animations.

The results of this research were developed by working closely in a multi-disciplinary team
environment. This has enabled individuals from a variety of backgrounds to have not only
theoretical input but also a technical say in the outcome of the present work.


Figure 8.1: Viscous solution of the Delta II and the boosters

Figure 8.2: Viscous solution of one of the separating boosters from the Delta II

Figure 8.3: Overall trajectory of booster tumbling from a fixed viewpoint

Figure 8.4: Density contour on cut through booster with rigid pole view at time t=0s

Figure 8.5: Density contour on cut through booster with rigid pole view at time t=21.8s

Figure 8.6: Density contour on cut through booster with rigid pole view at time t=29.5s

Figure 8.7: Contour on cut through booster with dynamic x,y static z view at time t=0.0s

Figure 8.8: Contour on cut through booster with dynamic x,y static z view at time t=21.8s

Figure 8.9: Contour on cut through booster with dynamic x,y static z view at time t=29.5s

Figure 8.10: Contour on cuts through booster with dynamic x,y static z view at time t=0.0s

Figure 8.11: Contour on cuts through booster with dynamic x,y static z view at time t=21.8s

Figure 8.12: Contour on cuts through booster with dynamic x,y static z view at time t=29.5s

Figure 8.13: Isosurfaces of density on booster with dynamic x,y static z view at time t=0.0s

Figure 8.14: Isosurfaces of density on booster with dynamic x,y static z view at time t=21.8s

Figure 8.15: Isosurfaces of density on booster with dynamic x,y static z view at time t=29.5s

The  DIVA  visualization  system  is  capable  of  handling  structured  multi-block,  unstructured 
tetrahedral,  and  unstructured  mixed  element  grids.  The  file  readers  associated  with  the  DIVA 
visualization  system  key  off  of  a  suffix  appended  to  the  end  of  the  grid  file  name.  A  structured 
multi-block  grid  has  the  suffix  .sgrid,  an  unstructured  tetrahedral  grid  has  the  suffix  .fgrid,  and 
an  unstructured  mixed  element  grid  has  the  suffix  .ugrid. 
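
Keying a reader off the file name suffix can be sketched as below. The three suffixes are the
ones listed above; the enum, the function names, and the handling of unrecognized names are
assumptions made for illustration and do not describe the DIVA source.

```cpp
#include <string>

enum class GridType
{
    StructuredMultiBlock,      // .sgrid
    UnstructuredTetrahedral,   // .fgrid
    UnstructuredMixed,         // .ugrid
    Unknown
};

// True when name ends with suffix.
bool hasSuffix(const std::string& name, const std::string& suffix)
{
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Select the grid type from the suffix of the grid file name.
GridType gridTypeFromName(const std::string& fileName)
{
    if (hasSuffix(fileName, ".sgrid")) return GridType::StructuredMultiBlock;
    if (hasSuffix(fileName, ".fgrid")) return GridType::UnstructuredTetrahedral;
    if (hasSuffix(fileName, ".ugrid")) return GridType::UnstructuredMixed;
    return GridType::Unknown;
}
```

A caller would switch on the returned value to choose the matching reader; the solution file
suffixes could be handled with the same pattern.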

Structured  Multi-block 

The  format  for  a  structured  multi-block  grid  is  as  follows: 

NumBlocks 

for  i  =  1  to  NumBlocks,  read  in  NumI,  NumJ,  NumK  Dimensions  for  Block  i 
for  i  =  1  to  NumBlocks,  for  j  =  1  to  NumI*NumJ*NumK,  read  X,Y,Z 


Unstructured  Tetrahedral 

The  format  for  an  unstructured  tetrahedral  grid  is  as  follows: 

NumNodes,  NumSurfTriangles,  NumVolTets 

for  i  =  1  to  NumNodes,  read  in  all  X  Coordinates 

for  i  =  1  to  NumNodes,  read  in  all  Y  Coordinates 

for  i  =  1  to  NumNodes,  read  in  all  Z  Coordinates 

for  i  =  1  to  NumSurfTriangles,  read  in  three  indices  into  coordinates 

for  i  =  1  to  NumSurfTriangles,  read  in  boundary  condition  for  triangle  i 

for  i  =  1  to  NumVolTets,  read  in  four  indices  into  coordinates 
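
The record order above maps directly onto a sequential reader. The sketch below assumes a
whitespace-delimited ASCII file, which the format description does not state; a binary file
would need a different reader. The TetGrid structure and the function names are hypothetical,
and the indices are stored exactly as read, without assuming a zero-based or one-based
convention.

```cpp
#include <array>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

struct TetGrid
{
    std::vector<double> x, y, z;                      // node coordinates
    std::vector<std::array<int, 3>> surfTriangles;    // three node indices per triangle
    std::vector<int> boundaryCondition;               // one flag per surface triangle
    std::vector<std::array<int, 4>> tets;             // four node indices per tetrahedron
};

// Reads the records in the order given above. Returns false if the stream ends
// early or contains a token that cannot be parsed.
bool readTetGrid(std::istream& in, TetGrid& grid)
{
    int numNodes = 0, numSurfTriangles = 0, numVolTets = 0;
    if (!(in >> numNodes >> numSurfTriangles >> numVolTets)) return false;
    if (numNodes < 0 || numSurfTriangles < 0 || numVolTets < 0) return false;

    grid.x.resize(numNodes);
    grid.y.resize(numNodes);
    grid.z.resize(numNodes);
    grid.surfTriangles.resize(numSurfTriangles);
    grid.boundaryCondition.resize(numSurfTriangles);
    grid.tets.resize(numVolTets);

    for (double& v : grid.x) in >> v;                 // all X coordinates
    for (double& v : grid.y) in >> v;                 // all Y coordinates
    for (double& v : grid.z) in >> v;                 // all Z coordinates
    for (auto& tri : grid.surfTriangles) in >> tri[0] >> tri[1] >> tri[2];
    for (int& bc : grid.boundaryCondition) in >> bc;
    for (auto& tet : grid.tets) in >> tet[0] >> tet[1] >> tet[2] >> tet[3];
    return static_cast<bool>(in);
}

// Convenience wrapper for parsing a grid held in memory.
bool parseTetGrid(const std::string& text, TetGrid& grid)
{
    std::istringstream in(text);
    return readTetGrid(in, grid);
}
```

Reading all X values, then all Y, then all Z mirrors the coordinate layout of the format, so
no transposition is needed when the coordinates are stored as three separate arrays.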


Unstructured  Mixed  Element 

The  format  for  an  unstructured  mixed  element  grid  is  as  follows: 

NumNodes,  NumSurfTriangles,  NumSurfQuads, 

NumVolTets,  NumVolPents5,  NumVolPents6,  NumVolHexs 

for  i  =  1  to  NumNodes,  read  in  all  X  Coordinates 

for  i  =  1  to  NumNodes,  read  in  all  Y  Coordinates 

for  i  =  1  to  NumNodes,  read  in  all  Z  Coordinates 

for  i  =  1  to  NumSurfTriangles,  read  in  three  indices  into  coordinates 

for  i  =  1  to  NumSurfQuads,  read  in  four  indices  into  coordinates 

for  i  =  1  to  NumSurfTriangles+NumSurfQuads,  read  in  boundary  condition  for  surface  i 

for  i  =  1  to  NumVolTets,  read  in  four  indices  into  coordinates 

for  i  =  1  to  NumVolPents5,  read  in  five  indices  into  coordinates 

for  i  =  1  to  NumVolPents6,  read  in  six  indices  into  coordinates 

for  i  =  1  to  NumVolHexs,  read  in  eight  indices  into  coordinates 

The DIVA visualization system is capable of handling the standard Q file, the incompressible
Q file, and the function file formats for solution input. The file readers associated with the
DIVA visualization system key off of a suffix appended to the end of the solution file name. A
structured multi-block standard Q file has the suffix .sflow, an unstructured tetrahedral standard
Q file has the suffix .fflow, and an unstructured mixed element standard Q file has the suffix
.uflow. A structured multi-block incompressible Q file has the suffix .zflow. A structured
multi-block function file has the suffix .sfunc, and an unstructured tetrahedral function file
has the suffix .ffunc or .unfunc.

Structured  Multi-block  Standard  Q  File 

The  format  for  a  structured  multi-block  standard  Q  file  is  as  follows: 

NumBlocks

for i = 1 to NumBlocks, read in NumI, NumJ, NumK Dimensions for Block i

for i = 1 to NumBlocks,
    read four doubles RefMach, Alpha, RefReynolds, Time
    for j = 1 to NumI*NumJ*NumK,
        read all block i’s density values
        read all block i’s density*u velocity values
        read all block i’s density*v velocity values
        read all block i’s density*w velocity values
        read all block i’s energy values
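
Because the standard Q file stores density and density times velocity, a reader typically
divides by density to recover the velocity components. The sketch below shows that step for a
single point. The pressure relation is an assumption: it is the usual perfect-gas expression
and treats the stored energy as total energy per unit volume, with the ratio of specific heats
defaulting to 1.4; none of this is stated in the format description. The names Primitive and
toPrimitive are hypothetical.

```cpp
#include <cassert>

struct Primitive
{
    double u, v, w;     // velocity components
    double pressure;    // from the assumed perfect-gas relation
};

// Convert one point of a standard Q file (density, density*u, density*v,
// density*w, energy) to velocity components and pressure.
Primitive toPrimitive(double rho, double rhoU, double rhoV, double rhoW,
                      double energy, double gamma = 1.4)
{
    assert(rho > 0.0);
    Primitive p{};
    p.u = rhoU / rho;
    p.v = rhoV / rho;
    p.w = rhoW / rho;
    const double kinetic = 0.5 * rho * (p.u * p.u + p.v * p.v + p.w * p.w);
    // Assumption: energy is total energy per unit volume of a perfect gas.
    p.pressure = (gamma - 1.0) * (energy - kinetic);
    return p;
}
```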


Unstructured  Standard  Q  File 

The format for an unstructured standard Q file is as follows:

NumNodes, NumJNodes, NumKNodes

read four doubles RefMach, Alpha, RefReynolds, Time

for i = 1 to NumNodes,
    read all node i’s density values
    read all node i’s density*u velocity values
    read all node i’s density*v velocity values
    read all node i’s density*w velocity values
    read all node i’s energy values


Structured  Multi-block  Incompressible  Q  File 

The format for a structured multi-block incompressible Q file is as follows:

NumBlocks

for i = 1 to NumBlocks, read in NumI, NumJ, NumK Dimensions for Block i

for i = 1 to NumBlocks,
    read four doubles RefMach, Alpha, RefReynolds, Time
    for j = 1 to NumI*NumJ*NumK,
        read all block i’s Q1 values
        read all block i’s u velocity values
        read all block i’s v velocity values
        read all block i’s w velocity values
        read all block i’s Q5 values


Structured  Multi-block  Function  File 

The format for a structured multi-block function file is as follows:

NumFunctions, NumBlocks

for i = 1 to NumFunctions, read in whether function is scalar 1 or vector 0

for i = 1 to NumBlocks, read in NumI, NumJ, NumK Dimensions for Block i

for i = 1 to NumFunctions,
    for j = 1 to NumBlocks,
        for k = 1 to NumI*NumJ*NumK for block j,
            if the function is a vector, read three values
            if the function is a scalar, read one value
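
The nested loops above fix how many values follow the header, which is useful for sizing
buffers before reading. A minimal sketch of that count, using the flag convention given in
the format (1 for scalar, 0 for vector); the function name and argument types are assumptions
made for illustration:

```cpp
#include <array>
#include <cstddef>
#include <vector>

// scalarFlags[i] is 1 when function i is a scalar and 0 when it is a vector.
// blockDims[j] holds NumI, NumJ, NumK for block j.
// Returns the number of values that follow the header of the function file.
std::size_t functionFileValueCount(const std::vector<int>& scalarFlags,
                                   const std::vector<std::array<int, 3>>& blockDims)
{
    std::size_t pointsInAllBlocks = 0;
    for (const auto& d : blockDims)
    {
        pointsInAllBlocks += static_cast<std::size_t>(d[0]) *
                             static_cast<std::size_t>(d[1]) *
                             static_cast<std::size_t>(d[2]);
    }

    std::size_t total = 0;
    for (int flag : scalarFlags)
        total += pointsInAllBlocks * (flag == 1 ? 1u : 3u);   // one value or three per point
    return total;
}
```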


Unstructured  Function  File 

The  format  for  an  unstructured  function  file  is  as  follows: 

NumFunctions

for i = 1 to NumFunctions,
    read NumFunctionNodes, NumJNodes, NumKNodes, FunctionFlag for function i

for i = 1 to NumBlocks, read in NumI, NumJ, NumK Dimensions for Block i

for i = 1 to NumFunctions,
    for j = 1 to NumFunctionNodes for function i
        if the function is a vector, read three values
        if the function is a scalar, read one value

void elementsSurroundingPoint( void )
{
    int    i      = 0;      // i is a counting variable.
    int    j      = 0;      // j is a counting variable.
    int    pIndex = 0;      // pIndex is a variable used to store a point index.
    int    eIndex = 0;      // eIndex is a variable used to store an element index.
    INT_1D *tNESP = NULL;   // tNESP is a temporary array for constructing nESP.
    INT_1D *nESP  = NULL;   // nESP contains information for number of elements
                            // surrounding a point.
    INT_1D *eSP   = NULL;   // eSP contains the specific elements surrounding each point.

    nESP  = new INT_1D[numberOfNodes+1];    // Allocate memory for actual array.
    tNESP = new INT_1D[numberOfNodes+1];    // Allocate memory for temporary array.
    for (i = 0; i <= numberOfNodes; i++)    // For i cycles over all nodes in grid plus one.
    {
        tNESP[i] = 0;                       // Initialize temporary array to contain zeros.
        nESP[i]  = 0;                       // Initialize actual array to contain zeros.
    }                                       // End cycle over all nodes in grid plus one.


    for (i = 0; i < numTetrahedra; i++)     // For i cycles over all tetrahedra in the grid.
    {
        for (j = 0; j < 4; j++)             // For j cycles over all nodes in tetrahedra i.
        {
            pIndex = tetrahedra[i][j];      // Dereference the node index.
            tNESP[pIndex]++;                // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in tetrahedra i.
    }                                       // End cycle i over all tetrahedra in the grid.

    for (i = 0; i < numPents5; i++)         // For i cycles over all five noded pents in the grid.
    {
        for (j = 0; j < 5; j++)             // For j cycles over all nodes in five noded pent i.
        {
            pIndex = pents5[i][j];          // Dereference the node index.
            tNESP[pIndex]++;                // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in five noded pent i.
    }                                       // End cycle i over all five noded pents in the grid.

    for (i = 0; i < numPents6; i++)         // For i cycles over all six noded pents in the grid.
    {
        for (j = 0; j < 6; j++)             // For j cycles over all nodes in six noded pent i.
        {
            pIndex = pents6[i][j];          // Dereference the node index.
            tNESP[pIndex]++;                // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in six noded pent i.
    }                                       // End cycle over all six noded pents in the grid.

    for (i = 0; i < numHexahedra; i++)      // For i cycles over all hexahedra in the grid.
    {
        for (j = 0; j < 8; j++)             // For j cycles over all nodes in the hexahedra i.
        {
            pIndex = hexahedra[i][j];       // Dereference the node index.
            tNESP[pIndex]++;                // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in hexahedra i.
    }                                       // End cycle over all hexahedra in the grid.


    for (i = 1; i <= numberOfNodes; i++)    // For i cycles over all nodes plus one starting
                                            // at one.
        nESP[i] = nESP[i-1] + tNESP[i-1];   // Increment nESP[i] to indicate all elements
                                            // surrounding a point up to that point.

    delete[] tNESP;                         // Deallocate the memory needed for this
                                            // temporary variable.

    eSP = new INT_1D[nESP[numberOfNodes]];  // Allocate the memory for actual element
                                            // indices.


    for (i = 0; i < numTetrahedra; i++)     // For i cycles over all tetrahedra in the grid.
    {
        for (j = 0; j < 4; j++)             // For j cycles over all nodes in tetrahedra i.
        {
            pIndex = tetrahedra[i][j];      // Dereference the node index.
            eIndex = nESP[pIndex];          // Dereference the current element index.
            eSP[eIndex] = tetIndex+i;       // Add this element to the list surrounding node
                                            // pIndex.
            nESP[pIndex]++;                 // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in tetrahedra i.
    }                                       // End cycle i over all tetrahedra in the grid.

    for (i = 0; i < numPents5; i++)         // For i cycles over all five noded pents in the grid.
    {
        for (j = 0; j < 5; j++)             // For j cycles over all nodes in five noded pent i.
        {
            pIndex = pents5[i][j];          // Dereference the node index.
            eIndex = nESP[pIndex];          // Dereference the current element index.
            eSP[eIndex] = pents5Index+i;    // Add this element to the list surrounding node
                                            // pIndex.
            nESP[pIndex]++;                 // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in five noded pents i.
    }                                       // End cycle i over all five noded pents in the grid.


    for (i = 0; i < numPents6; i++)         // For i cycles over all six noded pents in the grid.
    {
        for (j = 0; j < 6; j++)             // For j cycles over all nodes in six noded pent i.
        {
            pIndex = pents6[i][j];          // Dereference the node index.
            eIndex = nESP[pIndex];          // Dereference the current element index.
            eSP[eIndex] = pents6Index+i;    // Add this element to the list surrounding node
                                            // pIndex.
            nESP[pIndex]++;                 // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in six noded pents i.
    }                                       // End cycle i over all six noded pents in the grid.

    for (i = 0; i < numHexahedra; i++)      // For i cycles over all hexahedra in the grid.
    {
        for (j = 0; j < 8; j++)             // For j cycles over all nodes in hexahedra i.
        {
            pIndex = hexahedra[i][j];       // Dereference the node index.
            eIndex = nESP[pIndex];          // Dereference the current element index.
            eSP[eIndex] = hexIndex+i;       // Add this element to the list surrounding node
                                            // pIndex.
            nESP[pIndex]++;                 // Increment this node's number of elements by one.
        }                                   // End cycle j over all nodes in hexahedra i.
    }                                       // End cycle i over all hexahedra in the grid.
}                                           // End compute elements surrounding point.
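
The listing above builds a compressed point-to-element map in two passes: count the elements
touching each node, turn the counts into offsets with a running sum, then fill the element
list. The sketch below restates that technique in a self-contained form on std::vector. It is
an illustration, not the DIVA code: element connectivity is passed as one list of node indices
per element (zero-based, any element type), and a separate cursor array is used during the
fill so that the offsets remain valid afterwards, whereas the listing above advances nESP in
place. The names PointToElements and elementsSurroundingPoints are hypothetical.

```cpp
#include <cstddef>
#include <vector>

struct PointToElements
{
    std::vector<int> offset;    // size numNodes + 1; node p owns [offset[p], offset[p+1])
    std::vector<int> element;   // concatenated element indices for all nodes
};

// elements[e] lists the zero-based node indices of element e.
PointToElements elementsSurroundingPoints(int numNodes,
                                          const std::vector<std::vector<int>>& elements)
{
    PointToElements map;
    map.offset.assign(numNodes + 1, 0);

    // Pass 1: count the elements touching each node (stored one slot to the right).
    for (const auto& nodes : elements)
        for (int p : nodes)
            ++map.offset[p + 1];

    // Running sum turns the counts into start offsets.
    for (int p = 0; p < numNodes; ++p)
        map.offset[p + 1] += map.offset[p];

    // Pass 2: fill the element list, advancing a per-node cursor.
    map.element.resize(map.offset[numNodes]);
    std::vector<int> cursor(map.offset.begin(), map.offset.end() - 1);
    for (std::size_t e = 0; e < elements.size(); ++e)
        for (int p : elements[e])
            map.element[cursor[p]++] = static_cast<int>(e);

    return map;
}
```

For two tetrahedra sharing a face, the three shared nodes each list both elements and the two
remaining nodes list one element each, so the element array has eight entries in total.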

void buildElementNeighborMap( void )
{
    int i  = 0;         // i is a counting variable.
    int j  = 0;         // j is a counting variable.
    int p1 = 0;         // p1 is used as a node index.
    int p2 = 0;         // p2 is used as a node index.
    int p3 = 0;         // p3 is used as a node index.
    int p4 = 0;         // p4 is used as a node index.
    int p5 = 0;         // p5 is used as a node index.
    int p6 = 0;         // p6 is used as a node index.
    int ne = 0;         // ne is used as a holder for a matching element.
    int nf = 0;         // nf is used as a holder for a matching face.
    int ge = 0;         // ge is used to hold the global element index.
    INT_6D *eN = NULL;  // eN is the array containing explicit element neighbor information.

    eN = new INT_6D[numElements];       // Allocate memory for the element neighbor array.

    for (i = 0; i < numElements; i++)   // For i cycles over all elements in the grid.
    {
        eN[i][0] = -555;    // Initialize the neighbor in position 0 to value indicating not visited.
        eN[i][1] = -555;    // Initialize the neighbor in position 1 to value indicating not visited.
        eN[i][2] = -555;    // Initialize the neighbor in position 2 to value indicating not visited.
        eN[i][3] = -555;    // Initialize the neighbor in position 3 to value indicating not visited.
        eN[i][4] = -555;    // Initialize the neighbor in position 4 to value indicating not visited.
        eN[i][5] = -555;    // Initialize the neighbor in position 5 to value indicating not visited.
    }

    for (i = 0; i < numTetrahedra; i++)     // For i cycles over all tetrahedra in the grid.
    {
        p1 = tetrahedra[i][0];              // p1 contains an index to node 1 of tetrahedra i.
        p2 = tetrahedra[i][1];              // p2 contains an index to node 2 of tetrahedra i.
        p3 = tetrahedra[i][2];              // p3 contains an index to node 3 of tetrahedra i.
        p4 = tetrahedra[i][3];              // p4 contains an index to node 4 of tetrahedra i.
        ge = tetIndex+i;                    // ge contains the global element index.

        if (eN[ge][0] == -555)              // If neighbor 1 for tetrahedra i has not been visited,
        {
            // Find the neighbor element and face for neighbor 1 of the tetrahedra.
            commonElement(ge,p1,p2,p3,&ne,&nf);
            eN[ge][0] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 1 for tetrahedra i has not been visited.

        if (eN[ge][1] == -555)              // If neighbor 2 for tetrahedra i has not been visited,
        {
            // Find the neighbor element and face for neighbor 2 of the tetrahedra.
            commonElement(ge,p2,p3,p4,&ne,&nf);
            eN[ge][1] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 2 for tetrahedra i has not been visited.

        if (eN[ge][2] == -555)              // If neighbor 3 for tetrahedra i has not been visited,
        {
            // Find the neighbor element and face for neighbor 3 of the tetrahedra.
            commonElement(ge,p3,p4,p1,&ne,&nf);
            eN[ge][2] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 3 for tetrahedra i has not been visited.

        if (eN[ge][3] == -555)              // If neighbor 4 for tetrahedra i has not been visited,
        {
            // Find the neighbor element and face for neighbor 4 of the tetrahedra.
            commonElement(ge,p4,p2,p1,&ne,&nf);
            eN[ge][3] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 4 for tetrahedra i has not been visited.
    }                                       // End for i cycles over all tetrahedra in the grid.

    for (i = 0; i < numPents5; i++)         // For i cycles over all five noded pents in the grid.
    {
        p1 = pents5[i][0];                  // p1 contains an index to node 1 of five noded pent i.
        p2 = pents5[i][1];                  // p2 contains an index to node 2 of five noded pent i.
        p3 = pents5[i][2];                  // p3 contains an index to node 3 of five noded pent i.
        p4 = pents5[i][3];                  // p4 contains an index to node 4 of five noded pent i.
        p5 = pents5[i][4];                  // p5 contains an index to node 5 of five noded pent i.
        ge = pents5Index+i;                 // ge contains the global element index.

        if (eN[ge][0] == -555)              // If neighbor 1 for five noded pent i has not been visited,
        {
            // Find the neighbor element and face for neighbor 1 of the five noded pent.
            commonElement(ge,p1,p2,p3,&ne,&nf);
            eN[ge][0] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 1 for five noded pent i has not been visited.

        if (eN[ge][1] == -555)              // If neighbor 2 for five noded pent i has not been visited,
        {
            // Find the neighbor element and face for neighbor 2 of the five noded pent.
            commonElement(ge,p2,p5,p3,&ne,&nf);
            eN[ge][1] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 2 for five noded pent i has not been visited.

        if (eN[ge][2] == -555)              // If neighbor 3 for five noded pent i has not been visited,
        {
            // Find the neighbor element and face for neighbor 3 of the five noded pent.
            commonElement(ge,p5,p4,p3,&ne,&nf);
            eN[ge][2] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 3 for five noded pent i has not been visited.

        if (eN[ge][3] == -555)              // If neighbor 4 for five noded pent i has not been visited,
        {
            // Find the neighbor element and face for neighbor 4 of the five noded pent.
            commonElement(ge,p3,p4,p1,&ne,&nf);
            eN[ge][3] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 4 for five noded pent i has not been visited.

        if (eN[ge][4] == -555)              // If neighbor 5 for five noded pent i has not been visited,
        {
            // Find the neighbor element and face for neighbor 5 of the five noded pent.
            commonElement(ge,p1,p4,p5,p2,&ne,&nf);
            eN[ge][4] = ne;                 // Record the information in the element neighbor array.
            if (ne != -999)                 // If the neighbor element indicates an actual face,
                eN[ne][nf] = ge;            // Save the information for the neighbor also.
        }                                   // End if neighbor 5 for five noded pent i has not been visited.
    }                                       // End for i cycles over all five noded pents in the grid.

    for (i = 0; i < numPents6; i++)         // For i cycles over all six noded pents in the grid.
    {
        p1 = pents6[i][0];                  // p1 contains an index to node 1 of six noded pent i.
        p2 = pents6[i][1];                  // p2 contains an index to node 2 of six noded pent i.
        p3 = pents6[i][2];                  // p3 contains an index to node 3 of six noded pent i.
        p4 = pents6[i][3];                  // p4 contains an index to node 4 of six noded pent i.
        p5 = pents6[i][4];                  // p5 contains an index to node 5 of six noded pent i.
        p6 = pents6[i][5];                  // p6 contains an index to node 6 of six noded pent i.
        ge = pents6Index+i;                 // ge contains the global element index.
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if  (eN[ge][0]  ==  -555)  //  If  neighbor  1  for  six  noded  pent  i  has  not  been  visited, 

{ 

//  Find  the  neighbor  element  and  face  for  neighbor  1  of  the  six  noded  pent. 
commonElement(ge,pl,p2,p3,&ne,&nf); 

eN[ge][0]  =  ne;  //  Record  the  information  in  the  element  neighbor  array, 

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  1  for  six  noded  pent  i  has  not  been  visited, 

if (eN[ge][1] == -555) // If neighbor 2 for six noded pent i has not been visited,

{

// Find the neighbor element and face for neighbor 2 of the six noded pent.
commonElement(ge,p2,p3,p6,p5,&ne,&nf);

eN[ge][1] = ne; // Record the information in the element neighbor array.

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  2  for  six  noded  pent  i  has  not  been  visited, 

if  (eN[ge][2]  ==  -555)  //  If  neighbor  3  for  six  noded  pent  i  has  not  been  visited, 

{ 

//  Find  the  neighbor  element  and  face  for  neighbor  3  of  the  six  noded  pent. 
commonElement(ge,p4,p6,p5,&ne,&nf); 

eN[ge][2]  =  ne;  //  Record  the  information  in  the  element  neighbor  array, 

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  3  for  six  noded  pent  i  has  not  been  visited. 


if (eN[ge][3] == -555) // If neighbor 4 for six noded pent i has not been visited,

{

// Find the neighbor element and face for neighbor 4 of the six noded pent.
commonElement(ge,p4,p1,p3,p6,&ne,&nf);

eN[ge][3] = ne; // Record the information in the element neighbor array.

if (ne != -999) // If the neighbor element indicates an actual face,

eN[ne][nf] = ge; // Save the information for the neighbor also.

} // End if neighbor 4 for six noded pent i has not been visited.

if  (eN[ge][4]  ==  -555)  //  If  neighbor  5  for  six  noded  pent  i  has  not  been  visited, 

{ 

//  Find  the  neighbor  element  and  face  for  neighbor  5  of  the  six  noded  pent. 
commonElement(ge,p2,p1,p4,p5,&ne,&nf);

eN[ge][4]  =  ne;  //  Record  the  information  in  the  element  neighbor  array, 

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  5  for  six  noded  pent  i  has  not  been  visited. 

} // End for i cycles over all six noded pents in the grid.

for (i = 0; i < numHexahedra; i++) // For i cycles over all hexahedra in the grid.

{ 

p1 = hexahedra[i][0]; // p1 contains an index to node 1 of hexahedra i.
p2 = hexahedra[i][1]; // p2 contains an index to node 2 of hexahedra i.
p3 = hexahedra[i][2]; // p3 contains an index to node 3 of hexahedra i.
p4 = hexahedra[i][3]; // p4 contains an index to node 4 of hexahedra i.
p5 = hexahedra[i][4]; // p5 contains an index to node 5 of hexahedra i.
p6 = hexahedra[i][5]; // p6 contains an index to node 6 of hexahedra i.
p7 = hexahedra[i][6]; // p7 contains an index to node 7 of hexahedra i.
p8 = hexahedra[i][7]; // p8 contains an index to node 8 of hexahedra i.
ge = hexIndex+i; // ge contains the global element index.

if  (eN[ge][0]  ==  -555)  //  If  neighbor  1  for  hexahedra  i  has  not  been  visited. 

{ 

//  Find  the  neighbor  element  and  face  for  neighbor  1  of  the  hexahedra. 
commonElement(ge,p1,p2,p3,p4,&ne,&nf);

eN[ge][0] = ne; // Record the information in the element neighbor array.

if (ne != -999) // If the neighbor element indicates an actual face,

eN[ne][nf] = ge; // Save the information for the neighbor also.

} // End if neighbor 1 for hexahedra i has not been visited.

if (eN[ge][1] == -555) // If neighbor 2 for hexahedra i has not been visited,

{

// Find the neighbor element and face for neighbor 2 of the hexahedra.
commonElement(ge,p3,p7,p6,p2,&ne,&nf);

eN[ge][1] = ne; // Record the information in the element neighbor array.

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  2  for  hexahedra  i  has  not  been  visited, 

if (eN[ge][2] == -555) // If neighbor 3 for hexahedra i has not been visited,

{ 

//  Find  the  neighbor  element  and  face  for  neighbor  3  of  the  hexahedra. 
commonElement(ge,p6,p5,p8,p7,&ne,&nf); 

eN[ge][2]  =  ne;  //  Record  the  information  in  the  element  neighbor  array, 

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  3  for  hexahedra  i  has  not  been  visited. 

if (eN[ge][3] == -555) // If neighbor 4 for hexahedra i has not been visited,

{

// Find the neighbor element and face for neighbor 4 of the hexahedra.
commonElement(ge,p1,p4,p8,p5,&ne,&nf);

eN[ge][3] = ne; // Record the information in the element neighbor array.

if (ne != -999) // If the neighbor element indicates an actual face,

eN[ne][nf] = ge; // Save the information for the neighbor also.

} // End if neighbor 4 for hexahedra i has not been visited.

if (eN[ge][4] == -555) // If neighbor 5 for hexahedra i has not been visited,

{ 

//  Find  the  neighbor  element  and  face  for  neighbor  5  of  the  hexahedra. 
commonElement(ge,p4,p3  ,p7  ,p8  ,&ne,&nf) ; 

eN[ge][4]  =  ne;  //  Record  the  information  in  the  element  neighbor  array, 

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  5  for  hexahedra  i  has  not  been  visited, 

if (eN[ge][5] == -555) // If neighbor 6 for hexahedra i has not been visited,

{

// Find the neighbor element and face for neighbor 6 of the hexahedra.
commonElement(ge,p1,p5,p6,p2,&ne,&nf);

eN[ge][5]  =  ne;  //  Record  the  information  in  the  element  neighbor  array, 

if  (ne  !=  -999)  //  If  the  neighbor  element  indicates  an  actual  face, 

eN[ne][nf]  =  ge;  //  Save  the  information  for  the  neighbor  also. 

}  //  End  if  neighbor  6  for  hexahedra  i  has  not  been  visited. 

}  //  End  for  i  cycles  over  all  hexahedra  in  the  grid. 

}  //  End  buildElementNeighborMap. 
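
The neighbor map built above can be illustrated for the tetrahedral case alone. The following is a minimal sketch, not the routine from this listing: it swaps the point-to-element lookup for a std::map keyed on sorted face nodes, and the names buildTetNeighbors, kUnvisited, kBoundary, and kFaceNodes are hypothetical. The local face numbering is an assumption taken from the face tests in commonElement.

```cpp
#include <algorithm>
#include <array>
#include <map>
#include <utility>
#include <vector>

// Sentinels mirroring the listing: -555 = not yet visited, -999 = no neighbor.
const int kUnvisited = -555;
const int kBoundary = -999;

using Tet = std::array<int, 4>;          // four node indices per tetrahedron
using TetNeighbors = std::array<int, 4>; // one neighbor id per local face

// Assumed local face numbering: face -> three local node slots.
static const int kFaceNodes[4][3] = {{0, 1, 2}, {1, 2, 3}, {0, 2, 3}, {0, 1, 3}};

std::vector<TetNeighbors> buildTetNeighbors(const std::vector<Tet>& tets)
{
    TetNeighbors unvisited;
    unvisited.fill(kUnvisited);
    std::vector<TetNeighbors> eN(tets.size(), unvisited);

    // Faces seen exactly once so far: sorted node triple -> (element, local face).
    std::map<std::array<int, 3>, std::pair<int, int> > open;

    for (int e = 0; e < static_cast<int>(tets.size()); ++e) {
        for (int f = 0; f < 4; ++f) {
            std::array<int, 3> key = {{tets[e][kFaceNodes[f][0]],
                                       tets[e][kFaceNodes[f][1]],
                                       tets[e][kFaceNodes[f][2]]}};
            std::sort(key.begin(), key.end());
            auto it = open.find(key);
            if (it == open.end()) {
                open[key] = std::make_pair(e, f);
                eN[e][f] = kBoundary; // stays a boundary face unless a partner appears
            } else {
                const int ne = it->second.first;
                const int nf = it->second.second;
                eN[e][f] = ne;   // record the neighbor for this element,
                eN[ne][nf] = e;  // and save the information for the neighbor also
                open.erase(it);
            }
        }
    }
    return eN;
}
```

As in the listing, each shared face is resolved once and written into both elements, so no face is searched twice.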


void commonElement( int insideThisElement, // The global id for the element we are inside.
                    int p1, // The first point index for the face we want to match.
                    int p2, // The second point index for the face we want to match.
                    int p3, // The third point index for the face we want to match.
                    int *neighborElement, // Returns the matching element id.
                    int *neighborElementFace ) // Returns the matching face id.

{

if (nESP == NULL or eSP == NULL) // Check to see if maps are created.
createElementsSurroundingPoint(); // If not, then create it.
*neighborElement = -999; // Initialize return value to -999.
*neighborElementFace = -999; // Do the same with neighborElementFace.

int i = 0; // i is used as a holder.
int n1 = 0; // n1 is the number of elements surrounding point p1.
int done = 0; // done is a variable used as a stopping condition.
int sIndex = 0; // sIndex is the starting index for p1 into eSP.
int positionP1 = 0; // positionP1 will contain the position on the matching face.
int positionP2 = 0; // positionP2 will contain the position on the matching face.
int positionP3 = 0; // positionP3 will contain the position on the matching face.

if (p1 == 0)

{

n1 = nESP[p1]; // n1 is set to nESP[0] if p1 is 0.
sIndex = 0; // sIndex is also set to 0.

}

else

{

n1 = nESP[p1] - nESP[p1-1]; // n1 is set by finding difference in values in nESP.
sIndex = nESP[p1-1]; // sIndex is the value at index p1-1.

}

i = sIndex; // Let i cycle over all elements surrounding p1.
done = 0; // Initialize stopping condition to 0.

while (!done and i < sIndex+n1) // Cycle until a match or no more elements to check.

{ 

positionP1 = -1; // Initialize positionP1.

positionP2 = -1; // Initialize positionP2.

positionP3 = -1; // Initialize positionP3.

if (eSP[i] != insideThisElement) // Skip the element we are already inside.

{ 

if (eSP[i] >= tetIndex and eSP[i] < pents5Index) // Check for match in tetrahedra.

{

if (tetrahedra[eSP[i]][0] == p1)
positionP1 = 0;

else if (tetrahedra[eSP[i]][1] == p1)
positionP1 = 1;

else if (tetrahedra[eSP[i]][2] == p1)
positionP1 = 2;

else if (tetrahedra[eSP[i]][3] == p1)
positionP1 = 3;

if (tetrahedra[eSP[i]][0] == p2)
positionP2 = 0;

else if (tetrahedra[eSP[i]][1] == p2)
positionP2 = 1;

else if (tetrahedra[eSP[i]][2] == p2)
positionP2 = 2;

else if (tetrahedra[eSP[i]][3] == p2)
positionP2 = 3;

if (tetrahedra[eSP[i]][0] == p3)
positionP3 = 0;

else if (tetrahedra[eSP[i]][1] == p3)
positionP3 = 1;

else if (tetrahedra[eSP[i]][2] == p3)
positionP3 = 2;

else if (tetrahedra[eSP[i]][3] == p3)
positionP3 = 3;

if (positionP1 >= 0 and positionP2 >= 0 and positionP3 >= 0)

{ 

*neighborElement  =  eSP[i]; 

if (positionP1 != 3 and positionP2 != 3 and positionP3 != 3)
*neighborElementFace = 0;

else if (positionP1 != 0 and positionP2 != 0 and positionP3 != 0)
*neighborElementFace = 1;

else if (positionP1 != 1 and positionP2 != 1 and positionP3 != 1)
*neighborElementFace = 2;

else
*neighborElementFace = 3;

done = 1; // Set stopping condition to 1.

}

} // End if (eSP[i] >= tetIndex and eSP[i] < pents5Index)
else  if  (eSP[i]  >=  pents5Index  and  eSP[i]  <  pents6Index) 

{ 

if (pents[eSP[i]-pents5Index][0] == p1)
positionP1 = 0;

else if (pents[eSP[i]-pents5Index][1] == p1)
positionP1 = 1;

else if (pents[eSP[i]-pents5Index][2] == p1)
positionP1 = 2;

else if (pents[eSP[i]-pents5Index][3] == p1)
positionP1 = 3;

else if (pents[eSP[i]-pents5Index][4] == p1)
positionP1 = 4;

if (pents[eSP[i]-pents5Index][0] == p2)
positionP2 = 0;

else if (pents[eSP[i]-pents5Index][1] == p2)
positionP2 = 1;

else if (pents[eSP[i]-pents5Index][2] == p2)
positionP2 = 2;

else if (pents[eSP[i]-pents5Index][3] == p2)
positionP2 = 3;

else if (pents[eSP[i]-pents5Index][4] == p2)
positionP2 = 4;

if (pents[eSP[i]-pents5Index][0] == p3)
positionP3 = 0;

else if (pents[eSP[i]-pents5Index][1] == p3)
positionP3 = 1;

else if (pents[eSP[i]-pents5Index][2] == p3)
positionP3 = 2;

else if (pents[eSP[i]-pents5Index][3] == p3)
positionP3 = 3;

else if (pents[eSP[i]-pents5Index][4] == p3)
positionP3 = 4;

if (positionP1 >= 0 and positionP2 >= 0 and positionP3 >= 0)

{ 

*neighborElement  =  eSP[i]; 

if (positionP1 != 4 and positionP2 != 4 and positionP3 != 4 &&
positionP1 != 3 and positionP2 != 3 and positionP3 != 3)

{

*neighborElementFace = 0;
done = 1;

}

else if (positionP1 != 0 and positionP2 != 0 and positionP3 != 0 &&
positionP1 != 3 and positionP2 != 3 and positionP3 != 3)

{

*neighborElementFace = 1;
done = 1;

}

else if (positionP1 != 0 and positionP2 != 0 and positionP3 != 0 &&
positionP1 != 1 and positionP2 != 1 and positionP3 != 1)

{

*neighborElementFace = 3;
done = 1;

}

else if (positionP1 != 1 and positionP2 != 1 and positionP3 != 1 &&
positionP1 != 4 and positionP2 != 4 and positionP3 != 4)

{

*neighborElementFace = 4;
done = 1;

}

} // End if (positionP1 >= 0 and positionP2 >= 0 and positionP3 >= 0)
}  //  End  else  if  (eSP[i]  >=  pents5Index  and  eSP[i]  <  pents6Index) 

}  //  End  if  (eSP[i]  !=  insideThisElement) 

i++;  //  Increment  to  the  next  element  id  in  the  list. 

}  //  End  while  loop  to  cycle  for  a  match. 

}  //  End  commonElement  routine. 
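
The lookup in commonElement relies on a compressed list of the elements surrounding each point, where nESP holds cumulative end offsets into eSP. The following is a minimal sketch of that layout for tetrahedra only, under the assumption that nESP[p] is one past the last eSP slot owned by point p (which matches the start and count computed above). The names PointElementMap, buildPointElementMap, and findFaceNeighbor are hypothetical and are not part of the listing.

```cpp
#include <array>
#include <vector>

using Tet = std::array<int, 4>; // four node indices per tetrahedron

struct PointElementMap {
    std::vector<int> nESP; // nESP[p]: one past the last eSP slot owned by point p
    std::vector<int> eSP;  // element ids, grouped by point
};

PointElementMap buildPointElementMap(const std::vector<Tet>& tets, int numPoints)
{
    PointElementMap m;
    m.nESP.assign(numPoints, 0);
    for (const Tet& t : tets)
        for (int node : t) m.nESP[node]++;               // count elements per point
    for (int p = 1; p < numPoints; ++p)
        m.nESP[p] += m.nESP[p - 1];                       // counts -> end offsets

    std::vector<int> cursor(numPoints, 0);                // next free slot per point
    for (int p = 1; p < numPoints; ++p) cursor[p] = m.nESP[p - 1];

    m.eSP.assign(4 * tets.size(), -1);
    for (int e = 0; e < static_cast<int>(tets.size()); ++e)
        for (int node : tets[e]) m.eSP[cursor[node]++] = e;
    return m;
}

static bool hasNode(const Tet& t, int p)
{
    return t[0] == p || t[1] == p || t[2] == p || t[3] == p;
}

// Returns the element other than 'inside' that contains p1, p2 and p3, or -999.
int findFaceNeighbor(const std::vector<Tet>& tets, const PointElementMap& m,
                     int inside, int p1, int p2, int p3)
{
    const int start = (p1 == 0) ? 0 : m.nESP[p1 - 1];
    const int end = m.nESP[p1];
    for (int i = start; i < end; ++i) {
        const int candidate = m.eSP[i];
        if (candidate != inside &&
            hasNode(tets[candidate], p2) && hasNode(tets[candidate], p3))
            return candidate;
    }
    return -999;
}
```

Only the elements touching p1 are examined, so the cost of each face match depends on the local connectivity rather than on the grid size.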
void createVolumeChunking( int numXDivisions, // Number of X subdivisions.
                           int numYDivisions, // Number of Y subdivisions.
                           int numZDivisions ) // Number of Z subdivisions.

{

int i = 0; // i is used as a counting variable.
int j = 0; // j is used as a counting variable.
int k = 0; // k is used as a counting variable.
int numVolumes = numXDivisions*numYDivisions*numZDivisions;
int vIndex = 0; // Used as an index into chunking structure.
int p1 = 0; // Used to contain the index for point 1.
int p2 = 0; // Used to contain the index for point 2.
int p3 = 0; // Used to contain the index for point 3.
int p4 = 0; // Used to contain the index for point 4.
int p1XIndex = 0; // The X direction index for point 1.
int p1YIndex = 0; // The Y direction index for point 1.
int p1ZIndex = 0; // The Z direction index for point 1.
INT_1D *tetsInVolume = NULL; // Contains element indices into chunks.
INT_1D *tNumTetsInVolume = NULL; // Temp for constructing numTetsInVolume.
INT_1D *numTetsInVolume = NULL; // Contains number of tets in chunk.
double xInc = 0.0; // xInc is the increment in the X direction.
double yInc = 0.0; // yInc is the increment in the Y direction.
double zInc = 0.0; // zInc is the increment in the Z direction.
DOUBLE_3D Min; // Contains min bounding box information.
DOUBLE_3D Max; // Contains max bounding box information.

getBoundingBox(Min,Max); // Get the bounding box extremes.

xInc = (Max[0]-Min[0])/(numXDivisions*1.0); // Calculate the x increment.
yInc = (Max[1]-Min[1])/(numYDivisions*1.0); // Calculate the y increment.
zInc = (Max[2]-Min[2])/(numZDivisions*1.0); // Calculate the z increment.

numTetsInVolume = new INT_1D[numVolumes+1]; // Allocate memory.
tNumTetsInVolume = new INT_1D[numVolumes]; // Allocate temporary array.
for (i = 0; i < numVolumes; i++) // Cycle over all volumes.
tNumTetsInVolume[i] = 0; // Initialize all in array to 0.

// Cycle over all tetrahedra in the volume grid.
for (i = 0; i < numberOfVolumeTetrahedra; i++)

{

p1 = tetrahedra[i][0]; // Point index for P1.
p2 = tetrahedra[i][1]; // Point index for P2.
p3 = tetrahedra[i][2]; // Point index for P3.
p4 = tetrahedra[i][3]; // Point index for P4.

// Calculate the X index for point P1.
p1XIndex = (int)(((coordinates[p1][0]-Min[0])/(Max[0]-Min[0]))*numXDivisions);
// Calculate the Y index for point P1.
p1YIndex = (int)(((coordinates[p1][1]-Min[1])/(Max[1]-Min[1]))*numYDivisions);
// Calculate the Z index for point P1.
p1ZIndex = (int)(((coordinates[p1][2]-Min[2])/(Max[2]-Min[2]))*numZDivisions);

// If the calculated X index is at the edge of the volume, decrement it.
if (p1XIndex == numXDivisions)
p1XIndex--;

// If the calculated Y index is at the edge of the volume, decrement it.
if (p1YIndex == numYDivisions)
p1YIndex--;

// If the calculated Z index is at the edge of the volume, decrement it.
if (p1ZIndex == numZDivisions)
p1ZIndex--;

// Calculate the index into the volume chunking structure.
vIndex = p1XIndex*numYDivisions*numZDivisions +
p1YIndex*numZDivisions +
p1ZIndex;

tNumTetsInVolume[vIndex]++; // Increment num tets in volume.

}  //  End  cycle  over  all  tetrahedra  in  input  grid. 

// Initialize the first value in the array to 0.
numTetsInVolume[0] = 0;

// Set the values in the actual array in the manner similar to
// nESP shown in the construction of elements surrounding a point.
for (i = 1; i <= numVolumes; i++)

numTetsInVolume[i] = numTetsInVolume[i-1] +
tNumTetsInVolume[i-1];

// Deallocate the memory just used in construction of numTetsInVolume.
delete[] tNumTetsInVolume;


// Allocate the actual memory that is the volume chunking structure.
tetsInVolume = new INT_1D[numberOfVolumeTetrahedra];

//  Cycle  over  all  tetrahedra  in  the  volume  grid, 
for  (i  =  0;  i  <  numberOfVolumeTetrahedra;  i++) 

{ 


p1 = tetrahedra[i][0]; // Point index for P1.
p2 = tetrahedra[i][1]; // Point index for P2.
p3 = tetrahedra[i][2]; // Point index for P3.
p4 = tetrahedra[i][3]; // Point index for P4.

// Calculate the X index for point P1.
p1XIndex = (int)(((coordinates[p1][0]-Min[0])/(Max[0]-Min[0]))*numXDivisions);
// Calculate the Y index for point P1.
p1YIndex = (int)(((coordinates[p1][1]-Min[1])/(Max[1]-Min[1]))*numYDivisions);
// Calculate the Z index for point P1.
p1ZIndex = (int)(((coordinates[p1][2]-Min[2])/(Max[2]-Min[2]))*numZDivisions);

// If the calculated X index is at the edge of the volume, decrement it.
if (p1XIndex == numXDivisions)
p1XIndex--;

// If the calculated Y index is at the edge of the volume, decrement it.
if (p1YIndex == numYDivisions)
p1YIndex--;

// If the calculated Z index is at the edge of the volume, decrement it.
if (p1ZIndex == numZDivisions)
p1ZIndex--;

// Calculate the index into the volume chunking structure.
vIndex = p1XIndex*numYDivisions*numZDivisions +
p1YIndex*numZDivisions +
p1ZIndex;

// Place this element i into the chunking structure indicated by vIndex.
tetsInVolume[numTetsInVolume[vIndex]] = i;

// Increment the number of elements in this volume chunk by one.
numTetsInVolume[vIndex]++;

} // End cycle over all tetrahedra in input grid.

}  //  End  createVolumeChunking. 
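
The chunking routine above is essentially a counting sort of elements into a uniform grid of cells. The following is a minimal sketch of the same two-pass idea, assuming one representative point per element and a bounding box with non-zero extent on every axis. The names cellIndex, Chunks, and buildChunks are hypothetical, and unlike the listing the sketch keeps the start offsets intact by filling through a separate cursor array.

```cpp
#include <array>
#include <vector>

using Point = std::array<double, 3>;

// Map a coordinate to a cell index in [0, n-1]; values on the upper edge are
// pulled back into the last cell, as the listing does with its decrement.
int cellIndex(double v, double lo, double hi, int n)
{
    int idx = static_cast<int>(((v - lo) / (hi - lo)) * n);
    if (idx >= n) idx = n - 1;
    if (idx < 0) idx = 0;
    return idx;
}

struct Chunks {
    std::vector<int> start; // size numVolumes+1; chunk v is items[start[v] .. start[v+1])
    std::vector<int> items; // element ids grouped by chunk
};

Chunks buildChunks(const std::vector<Point>& pts, const Point& lo, const Point& hi,
                   int nx, int ny, int nz)
{
    const int numVolumes = nx * ny * nz;
    const int n = static_cast<int>(pts.size());
    std::vector<int> cell(n, 0);

    Chunks c;
    c.start.assign(numVolumes + 1, 0);

    // Pass 1: count how many elements fall into each chunk.
    for (int i = 0; i < n; ++i) {
        const int ix = cellIndex(pts[i][0], lo[0], hi[0], nx);
        const int iy = cellIndex(pts[i][1], lo[1], hi[1], ny);
        const int iz = cellIndex(pts[i][2], lo[2], hi[2], nz);
        cell[i] = ix * ny * nz + iy * nz + iz;
        c.start[cell[i] + 1]++;
    }

    // Prefix sum turns the counts into start offsets.
    for (int v = 0; v < numVolumes; ++v) c.start[v + 1] += c.start[v];

    // Pass 2: drop each element into the next free slot of its chunk.
    std::vector<int> cursor(c.start.begin(), c.start.end() - 1);
    c.items.assign(n, -1);
    for (int i = 0; i < n; ++i) c.items[cursor[cell[i]]++] = i;
    return c;
}
```

With this layout the candidates for a query point are the contiguous range of one chunk, which is what getElementContainingPoint reads back through its start and end indices.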
void getElementContainingPoint( double x, // x value of the point.
                                double y, // y value of the point.
                                double z, // z value of the point.
                                int *e, // Contains the element found upon return.
                                        // May also contain starting element for search.
                                INT_1D *visited, // Contains visit info for each element.
                                int visit, // Current visit id.
                                Boolean useChunking, // Flag to tell routine that
                                                     // volume chunking used to set
                                                     // starting element.
                                int bruteForce ) // Flag to tell routine that if element
                                                 // not found in search, use brute force
                                                 // searching to cycle through all elements
                                                 // not visited.

{

int xIndex = 0; // The xIndex for the given point.
int yIndex = 0; // The yIndex for the given point.
int zIndex = 0; // The zIndex for the given point.
int vIndex = 0; // vIndex is the index into volume chunking.
int sIndex = 0; // sIndex is the index into numTetsInVolume.
int eIndex = 0; // eIndex is the index into numTetsInVolume.
int tetIndex = 0; // Place holder for possible containing element.
double V1 = 0.0; // Sub volume 1 of candidate tetrahedra.
double V2 = 0.0; // Sub volume 2 of candidate tetrahedra.
double V3 = 0.0; // Sub volume 3 of candidate tetrahedra.
double V4 = 0.0; // Sub volume 4 of candidate tetrahedra.

DOUBLE_3D Min; // Minimum value of bounding volume.

DOUBLE_3D Max; // Maximum value of bounding volume.

//  The  volume  chunking  is  based  on  the  bounding  box  of  all  input  data. 
getBoundingBox(Min,Max); 


if  (useChunking) 

{ 

//  Calculate  the  x  index  for  the  x  value  sent  into  the  search, 
xIndex = (int)(((x-Min[0])/(Max[0]-Min[0]))*numXDivisions);
// Calculate the y index for the y value sent into the search.
yIndex = (int)(((y-Min[1])/(Max[1]-Min[1]))*numYDivisions);
// Calculate the z index for the z value sent into the search.
zIndex = (int)(((z-Min[2])/(Max[2]-Min[2]))*numZDivisions);

// If the xIndex is at the edge of the volume, decrement it.
if (xIndex == numXDivisions)
xIndex--;

// If the yIndex is at the edge of the volume, decrement it.
if (yIndex == numYDivisions)
yIndex--;

// If the zIndex is at the edge of the volume, decrement it.
if (zIndex == numZDivisions)
zIndex--;

// Calculate the volume index into the volume chunking structure.
vIndex = xIndex*numYDivisions*numZDivisions +
yIndex*numZDivisions +
zIndex;

// Initialize start and end indices to be zero.
sIndex = 0;
eIndex = 0;

// If the volume chunking index is zero, then sIndex is zero,
if (vIndex == 0)
sIndex = 0;

// Else, find the start index for the elements in this volume chunk.
else

sIndex = numTetsInVolume[vIndex-1];

// The ending index is the value at vIndex.
eIndex = numTetsInVolume[vIndex];

// If sIndex and eIndex are the same, then there are no elements in this chunk,
if (eIndex == sIndex)

*e = 0;

// Else, set a candidate element to the first element listed in the chunk.
else

*e = tetsInVolume[sIndex];

} 

//  element  is  used  as  a  placeholder  for  the  current  element  index, 
element  =  *e; 

//  Call  the  recursive  search  algorithm  with  this  candidate  element  index. 

//  If  useChunking  is  False,  then  the  recursive  search  starts  with  an 
//  element  that  is  passed  in  through  e. 

recursiveSearch(x,y,z,element,e,&found,visited,visit,tolerance);

//  If  the  containing  element  has  not  been  found  and  bruteForce  flag  is  True, 
if  (found  ==  0  and  bruteForce) 

{ 

//  Calculate  the  x  index  for  the  x  value  sent  into  the  search. 


xIndex = (int)(((x-Min[0])/(Max[0]-Min[0]))*numXDivisions);

// Calculate the y index for the y value sent into the search.
yIndex = (int)(((y-Min[1])/(Max[1]-Min[1]))*numYDivisions);

// Calculate the z index for the z value sent into the search.
zIndex = (int)(((z-Min[2])/(Max[2]-Min[2]))*numZDivisions);

// If the xIndex is at the edge of the volume, decrement it.
if (xIndex == numXDivisions)
xIndex--;

// If the yIndex is at the edge of the volume, decrement it.
if (yIndex == numYDivisions)
yIndex--;

// If the zIndex is at the edge of the volume, decrement it.
if (zIndex == numZDivisions)
zIndex--;

// Calculate the volume index into the volume chunking structure.
vIndex = xIndex*numYDivisions*numZDivisions +
yIndex*numZDivisions +
zIndex;

// Initialize start and end indices to be zero.
sIndex = 0;
eIndex = 0;

// If the volume chunking index is zero, then sIndex is zero,
if (vIndex == 0)
sIndex = 0;

// Else, find the start index for the elements in this volume chunk.
else

sIndex = numTetsInVolume[vIndex-1];

// The ending index is the value at vIndex.
eIndex = numTetsInVolume[vIndex];

// Cycle over all elements in this volume chunk.
for (i = sIndex; i < eIndex; i++)

{

// Dereference the index for this element in this chunk.
tetIndex = tetsInVolume[i];

// Check this element only if it has not already been visited.
if (visited[tetIndex] != visit)

{

// Set the visited flag for this element.
visited[tetIndex] = visit;

// Check whether this element contains the given point.
found = doesElementContainThisPoint(x,y,z,tetIndex,tolerance,
&V1,&V2,&V3,&V4);

if  (found) 

{ 

*e = tetIndex;
return; 

} 

}  //  End  if  (visited[tetlndex]  !=  visit) 

}  //  End  for  (i  =  slndex;  i  <  elndex;  i++) 

*e = -999; 

}  //  End  if  (found  ==  0  and  bruteForce) 
else if (found == 0 and !bruteForce)

*e = -999; 

}  //  End  getElementContainingPoint. 
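
The visited array and visit id passed to the routine above appear to work as a stamp: an element counts as visited only if its entry equals the current id, so starting a new search needs a new id rather than a cleared array. The following is a minimal sketch of that idea; the VisitStamps type and its member names are hypothetical and do not appear in the listing.

```cpp
#include <vector>

// Per-element stamps: an element is "visited" in the current search only when
// its stamp equals the current search id.
struct VisitStamps {
    std::vector<int> stamp;
    int current;

    explicit VisitStamps(int numElements) : stamp(numElements, 0), current(0) {}

    // Start a new search; every element becomes unvisited in O(1).
    void beginSearch() { ++current; }

    // Returns true the first time an element is seen in the current search.
    bool tryVisit(int element)
    {
        if (stamp[element] == current) return false;
        stamp[element] = current;
        return true;
    }
};
```

The design choice matters when many point queries are issued against a large grid, because resetting a full array per query would dominate the cost of a short walk.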
void recursiveSearch( double x, // x value of point given.
                      double y, // y value of point given.
                      double z, // z value of point given.
                      int element, // element currently inside.
                      int *e, // return value for element found.
                      int *found, // returns whether element found.
                      INT_1D *visited, // array containing visited flags for each element.
                      int visit, // current visit id.
                      double tolerance ) // tolerance for element containment.

{

double V1 = 0.0; // Sub volume one of the given tetrahedra.
double V2 = 0.0; // Sub volume two of the given tetrahedra.
double V3 = 0.0; // Sub volume three of the given tetrahedra.
double V4 = 0.0; // Sub volume four of the given tetrahedra.

Boolean foundFlag = False; // Flag to determine whether element found.

// Set the visited flag for this element to the current visit id.
visited[element] = visit;

// Check whether this element contains the given point.
foundFlag = doesElementContainThisPoint(x,y,z,element,tolerance,
&V1,&V2,&V3,&V4);

//  If  this  element  contains  the  given  point,  we  are  done, 
if  (foundFlag) 

{ 

*e  =  element; 

*found = 1;
return; 

} 

//  Else,  we  have  to  go  through  each  of  the  neighbors  connected  to  this  element, 
else 
{ 

//  If  the  calculated  sub  volume  is  negative  and  the  element  has  not  been  found, 
if ( V1 < 0.0 and *found != 1)

{

// If neighbor 2 exists and it has not already been visited,
if ( eN[element][1] >= 0 and visited[eN[element][1]] != visit )

{

// Call the recursive searching routine with element neighbor 2.
recursiveSearch(x, y, z, eN[element][1], e, found, visited, visit, tolerance);

}

}

// If the calculated sub volume is negative and the element has not been found,
if ( V2 < 0.0 and *found != 1)

{

// If neighbor 3 exists and it has not already been visited,
if ( eN[element][2] >= 0 and visited[eN[element][2]] != visit )

{

// Call the recursive searching routine with element neighbor 3.
recursiveSearch(x, y, z, eN[element][2], e, found, visited, visit, tolerance);

}

}

// If the calculated sub volume is negative and the element has not been found,
if ( V3 < 0.0 and *found != 1)

{

// If neighbor 4 exists and it has not already been visited,
if ( eN[element][3] >= 0 and visited[eN[element][3]] != visit )

{

// Call the recursive searching routine with element neighbor 4.
recursiveSearch(x, y, z, eN[element][3], e, found, visited, visit, tolerance);

}

}

// If the calculated sub volume is negative and the element has not been found,
if ( V4 < 0.0 and *found != 1)

{

// If neighbor 1 exists and it has not already been visited,
if ( eN[element][0] >= 0 and visited[eN[element][0]] != visit )

{

// Call the recursive searching routine with element neighbor 1.
recursiveSearch(x, y, z, eN[element][0], e, found, visited, visit, tolerance);

}

}

}  //  End  else. 

}  //  End  recursiveSearch. 
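
The recursive walk above can exhaust the call stack on long search paths. The following is a minimal sketch of the same neighbor walk written with an explicit stack instead of recursion; it is an alternative formulation, not the routine from this listing. The function name walkSearch and the classify callback are hypothetical: classify is assumed to return true when the element contains the target, and otherwise to append the local faces through which the target lies (the faces whose sub volume is negative).

```cpp
#include <functional>
#include <vector>

// Walks from startElement through face neighbors until classify() accepts an
// element. eN[e][f] is the neighbor across local face f of element e, negative
// when there is none. Returns the containing element, or -999 if the walk
// runs out of candidates.
int walkSearch(int startElement,
               const std::vector<std::vector<int> >& eN,
               std::vector<int>& visited, int visit,
               const std::function<bool(int, std::vector<int>&)>& classify)
{
    std::vector<int> stack(1, startElement);
    visited[startElement] = visit;

    while (!stack.empty()) {
        const int element = stack.back();
        stack.pop_back();

        std::vector<int> faces;
        if (classify(element, faces)) return element;

        // Push in reverse so the first candidate face is explored first,
        // as it would be in the recursive form.
        for (int k = static_cast<int>(faces.size()) - 1; k >= 0; --k) {
            const int neighbor = eN[element][faces[k]];
            if (neighbor >= 0 && visited[neighbor] != visit) {
                visited[neighbor] = visit; // stamp on push to avoid duplicates
                stack.push_back(neighbor);
            }
        }
    }
    return -999;
}
```

Elements are stamped when pushed rather than when examined, so the exact visiting order can differ slightly from the recursive version while the set of reachable elements stays the same.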
Boolean doesElementContainThisPoint( double x,
                                     double y,
                                     double z,
                                     int element,
                                     double tolerance,
                                     double *V1,
                                     double *V2,
                                     double *V3,
                                     double *V4 )

{

int p1 = 0; // p1 is the point 1 index.
int p2 = 0; // p2 is the point 2 index.
int p3 = 0; // p3 is the point 3 index.
int p4 = 0; // p4 is the point 4 index.
double V = 0.0; // V is the volume of the element.
double volmin = 0.0; // volmin holds a minimum volume.
double w = 0.0; // w is a weighting factor.

p1 = tetrahedra[element][0];
p2 = tetrahedra[element][1];
p3 = tetrahedra[element][2];
p4 = tetrahedra[element][3];

// Calculate the volume of the tetrahedra indicated by element.
V = calculateVolume(coordinates[p1],
coordinates[p2],
coordinates[p3],
coordinates[p4]);

// V1 is the volume created by the given point, point P2, point P3, and point P4.
*V1 = calculateVolume(X,coordinates[p2],coordinates[p3],coordinates[p4]);

// V2 is the volume created by the given point, point P1, point P4, and point P3.
*V2 = calculateVolume(X,coordinates[p1],coordinates[p4],coordinates[p3]);

// V3 is the volume created by the given point, point P4, point P1, and point P2.
*V3 = calculateVolume(X,coordinates[p4],coordinates[p1],coordinates[p2]);

// Because V1+V2+V3+V4 must be 1.0, then V4 is given as,
*V4 = 1.0 - (*V1) - (*V2) - (*V3);

// Find the minimum of all of the subvolumes V1, V2, V3, and V4.
volmin = MIN ((*V1), MIN ((*V2), MIN ((*V3), (*V4))));

// Weight this by a tolerance multiplied by the volume of the element.
w = volmin + tolerance * V;

if (w >= 0.0) // If the weighting is non-negative,

return  True;  //  this  element  contains  the  given  point, 
else  //  else, 

return  False;  //  this  element  does  not  contain  the  given  point. 

}  //  End  doesElementContainThisPoint. 
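
The containment test above rests on sub volumes that sum to one, which suggests normalized (barycentric) weights. The following is a minimal sketch of such a test, assuming a non-degenerate tetrahedron and a dimensionless tolerance applied to the normalized weights. The names signedVolume and tetContainsPoint are hypothetical, and the vertex ordering inside each sub volume is chosen for the sketch rather than copied from the listing.

```cpp
#include <algorithm>
#include <array>

using Point = std::array<double, 3>;

// One sixth of the scalar triple product (b-a) . ((c-a) x (d-a)).
double signedVolume(const Point& a, const Point& b, const Point& c, const Point& d)
{
    const double bx = b[0] - a[0], by = b[1] - a[1], bz = b[2] - a[2];
    const double cx = c[0] - a[0], cy = c[1] - a[1], cz = c[2] - a[2];
    const double dx = d[0] - a[0], dy = d[1] - a[1], dz = d[2] - a[2];
    return (bx * (cy * dz - cz * dy) +
            by * (cz * dx - cx * dz) +
            bz * (cx * dy - cy * dx)) / 6.0;
}

// Fills w with the four normalized sub volumes of x with respect to the
// tetrahedron (p1, p2, p3, p4) and returns true when the smallest weight,
// relaxed by tolerance, is non-negative. Assumes the tetrahedron has
// non-zero volume.
bool tetContainsPoint(const Point& x,
                      const Point& p1, const Point& p2,
                      const Point& p3, const Point& p4,
                      double tolerance, std::array<double, 4>& w)
{
    const double V = signedVolume(p1, p2, p3, p4);
    w[0] = signedVolume(x, p2, p3, p4) / V;  // x replaces p1
    w[1] = signedVolume(p1, x, p3, p4) / V;  // x replaces p2
    w[2] = signedVolume(p1, p2, x, p4) / V;  // x replaces p3
    w[3] = 1.0 - w[0] - w[1] - w[2];         // weights sum to one
    const double wmin = *std::min_element(w.begin(), w.end());
    return wmin + tolerance >= 0.0;
}
```

A negative weight identifies the face the point lies beyond, which is the information the neighbor walk uses to pick its next element.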


APPENDIX  G 

ALGORITHM TO CALCULATE AN ARBITRARY CUTTING PLANE WITH NO DUPLICATE POINTS

void calculateCuttingPlane( DOUBLE_3D point,             // A point on the cutting plane.
                            DOUBLE_3D normal,            // The normal to the cutting plane.
                            double    tolerance,         // A tolerance for intersections.
                            int       *nAllocatedSurface, // Current size of Tris.
                            int       *nPts,             // Number of points in extracted surface.
                            DOUBLE_3D **Pts,             // List of resulting points.
                            int       *nTris,            // Number of triangles in extracted surface.
                            INT_3D    **Tris,            // List of resulting triangle point indices.
                            INT_1D    **EFPts,           // List of element ids that correspond
                                                         // to points in extracted surface.
                            int       gridIndex,         // If dealing with multiple grids,
                                                         // then gridIndex will vary.
                            INT_1D    *nAllocatedElement, // Current size of the array Elem.
                            INT_1D    **nElem,           // Number of intersected elements in grid
                                                         // indicated by gridIndex.
                            INT_1D    **Elem )           // List of intersected element ids in
                                                         // extracted surface.
{

Boolean   f1 = False;        // Indicates whether face 1 of the current
                             // element has been intersected.
Boolean   f2 = False;        // Indicates whether face 2 of the current
                             // element has been intersected.
Boolean   f3 = False;        // Indicates whether face 3 of the current
                             // element has been intersected.
Boolean   f4 = False;        // Indicates whether face 4 of the current
                             // element has been intersected.
Boolean   *eVisited = NULL;  // Array to contain whether element has been
                             // visited in surface extraction routine.
int       nElements = 0;     // The number of elements.
int       i = 0;             // i is used as a counting variable.
int       j = 0;             // j is used as a counting variable.
int       k = 0;             // k is used as a counting variable.
int       e = 0;             // Contains local element id.
int       pIndex = 0;        // pIndex contains a point index.
int       eIndex = 0;        // eIndex contains an element index.
int       inPts = 0;         // The local number of intersection points per element.
int       inTris = 0;        // The local number of triangles generated from element
                             // plane intersection.
int       it[6];             // Contains whether each edge of the tetrahedron has
                             // been intersected or not.
int       pit[6];            // Contains the intersection point on each edge of the
                             // tetrahedron.
int       gIndex[6];         // Contains global index of intersection point in the
                             // array Pts.
int       pgIndex[6];        // Contains previous intersection points global index in the
                             // array Pts.
INT_1D    *visited = NULL;   // Array to contain whether element has been
                             // visited for use in point containment.
INT_1D    *eit = NULL;       // Contains all elements intersection indicators.
INT_1D    *egIndex = NULL;   // Contains all elements global index values in the array Pts.
INT_3D    *iTris = NULL;     // Contains the local triangles cut on a per element basis.

double    t0 = 0.0;          // Variable to determine edge crossings for
                             // those containing point 0.
double    t1 = 0.0;          // Variable to determine edge crossings for
                             // those containing point 1.
double    t2 = 0.0;          // Variable to determine edge crossings for
                             // those containing point 2.
double    t3 = 0.0;          // Variable to determine edge crossings for
                             // those containing point 3.
double    tol = 0.0;         // Local cutting tolerance based on tolerance passed
                             // into routine.
DOUBLE_3D p0;                // Variable to contain values for point 0.
DOUBLE_3D p1;                // Variable to contain values for point 1.
DOUBLE_3D p2;                // Variable to contain values for point 2.
DOUBLE_3D p3;                // Variable to contain values for point 3.
DOUBLE_3D ppt[6];            // Previous elements intersection point.
DOUBLE_3D pt[6];             // Local elements intersection points.
DOUBLE_3D *iPts = NULL;      // Local elements intersection points.
DOUBLE_3D *ept = NULL;       // Contains all intersection points for all elements.


// Allocate the memory for recording element visits.
eVisited = new Boolean[numberOfVolumeTetrahedra];

// Allocate the memory for element containment visits.
visited = new INT_1D[numberOfVolumeTetrahedra];

// Allocate the memory to record all intersection indicators for all elements.
eit = new INT_1D[numberOfVolumeTetrahedra*6];

// Allocate the memory to record global index for intersection points for all elements.
egIndex = new INT_1D[numberOfVolumeTetrahedra*6];

// Allocate the memory to handle local intersections on per element basis.
iTris = new INT_3D[20];

// Allocated to store local intersection points on per element basis.
iPts = new DOUBLE_3D[20];

// Allocated to record all intersection points for all elements.
ept = new DOUBLE_3D[numberOfVolumeTetrahedra*6];


i = 0;  // i is used as a global index.

// For all tetrahedra in the input grid, initialize global structures.
for (j = 0; j < numberOfVolumeTetrahedra*6; j++)
{
    eit[i] = 0;          // Initialize intersection indicators to 0.
    ept[i][0] = 0.0;     // Initialize x intersection point value to 0.0.
    ept[i][1] = 0.0;     // Initialize y intersection point value to 0.0.
    ept[i][2] = 0.0;     // Initialize z intersection point value to 0.0.
    egIndex[i] = -999;   // Initialize global index to -999 (no point assigned yet).
    i++;                 // Increment the global index.
}

*nPts = 0;   // Initialize the number of points to 0.
*nTris = 0;  // Initialize the number of triangles to 0.

for (i = 0; i < gridIndex; i++)  // For each grid that has already been cut,
    nElements += (*nElem)[i];    // find the total number of elements.

//  Cycle  over  all  tetrahedra  in  the  input  grid  and  initialize  visits, 
for  (i  =  0;  i  <  numberOfVolumeTetrahedra;  i++) 

{ 

visited[i]  =  0;  //  Initialize  visited  for  searching  all  to  visit  0. 

eVisited[i]  =  False;  //  Initialize  all  elements  to  not  visited. 

} 

getElementContainingPoint( point[0],  // X value of point on the cutting plane.
                           point[1],  // Y value of point on the cutting plane.
                           point[2],  // Z value of point on the cutting plane.
                           &e,        // Element containing this point is returned in e.
                           visited,   // Contains current visits for each element.
                           1,         // The visit flag for this round of searching.
                           1E-5,      // Provide a tolerance for element containment.
                           True,      // Get a starting element using chunking scheme.
                           True );    // If point is not found locally, search globally.

p0[0] = coordinates[tetrahedra[e][0]][0];  // Set x value of point 0.
p0[1] = coordinates[tetrahedra[e][0]][1];  // Set y value of point 0.
p0[2] = coordinates[tetrahedra[e][0]][2];  // Set z value of point 0.

p1[0] = coordinates[tetrahedra[e][1]][0];  // Set x value of point 1.
p1[1] = coordinates[tetrahedra[e][1]][1];  // Set y value of point 1.
p1[2] = coordinates[tetrahedra[e][1]][2];  // Set z value of point 1.

p2[0] = coordinates[tetrahedra[e][2]][0];  // Set x value of point 2.
p2[1] = coordinates[tetrahedra[e][2]][1];  // Set y value of point 2.
p2[2] = coordinates[tetrahedra[e][2]][2];  // Set z value of point 2.

p3[0] = coordinates[tetrahedra[e][3]][0];  // Set x value of point 3.
p3[1] = coordinates[tetrahedra[e][3]][1];  // Set y value of point 3.
p3[2] = coordinates[tetrahedra[e][3]][2];  // Set z value of point 3.


// Compute the dot product of the vector from the plane point to
// point 0 and the normal to the plane.
t0 = (p0[0] - point[0]) * normal[0] +
     (p0[1] - point[1]) * normal[1] +
     (p0[2] - point[2]) * normal[2];

// Compute the dot product of the vector from the plane point to
// point 1 and the normal to the plane.
t1 = (p1[0] - point[0]) * normal[0] +
     (p1[1] - point[1]) * normal[1] +
     (p1[2] - point[2]) * normal[2];

// Compute the dot product of the vector from the plane point to
// point 2 and the normal to the plane.
t2 = (p2[0] - point[0]) * normal[0] +
     (p2[1] - point[1]) * normal[1] +
     (p2[2] - point[2]) * normal[2];

// Compute the dot product of the vector from the plane point to
// point 3 and the normal to the plane.
t3 = (p3[0] - point[0]) * normal[0] +
     (p3[1] - point[1]) * normal[1] +
     (p3[2] - point[2]) * normal[2];


//  Base  tolerance  on  the  max  edge  length  and  machine  precision, 
tol  =  (getMaxEdgeLength(e))*tolerance; 


// If the cutting plane intersects this tetrahedron,
if ( (t0 >= -tol and t1 <= tol) or (t0 <= tol and t1 >= -tol) or
     (t0 >= -tol and t2 <= tol) or (t0 <= tol and t2 >= -tol) or
     (t0 >= -tol and t3 <= tol) or (t0 <= tol and t3 >= -tol) or
     (t1 >= -tol and t2 <= tol) or (t1 <= tol and t2 >= -tol) or
     (t1 >= -tol and t3 <= tol) or (t1 <= tol and t3 >= -tol) or
     (t2 >= -tol and t3 <= tol) or (t2 <= tol and t3 >= -tol) )

{ 

for (i = 0; i < 6; i++)  // For all potential intersection points,
{
    pt[i][0] = 0.0;   // Initialize x intersection to 0.0.
    pt[i][1] = 0.0;   // Initialize y intersection to 0.0.
    pt[i][2] = 0.0;   // Initialize z intersection to 0.0.
    it[i] = 0;        // Initialize intersection indicator to 0.
    gIndex[i] = 0;    // Initialize global index values to 0.

    ppt[i][0] = 0.0;  // Initialize previous x intersection to 0.0.
    ppt[i][1] = 0.0;  // Initialize previous y intersection to 0.0.
    ppt[i][2] = 0.0;  // Initialize previous z intersection to 0.0.
    pit[i] = 0;       // Initialize previous intersection indicator to 0.
    pgIndex[i] = 0;   // Initialize previous global index values to 0.
}

// Calculate the intersection of the plane with the tetrahedron.
intersectPlaneWithTetrahedra( e,          // The element id for the tetrahedron.
                              point,      // The point on the cutting plane.
                              normal,     // The normal to the cutting plane.
                              p0,         // Point 0 on the tetrahedron.
                              p1,         // Point 1 on the tetrahedron.
                              p2,         // Point 2 on the tetrahedron.
                              p3,         // Point 3 on the tetrahedron.
                              t0,         // Cut parameter for point 0.
                              t1,         // Cut parameter for point 1.
                              t2,         // Cut parameter for point 2.
                              t3,         // Cut parameter for point 3.
                              &f1,        // Returns whether face 1 is cut or not.
                              &f2,        // Returns whether face 2 is cut or not.
                              &f3,        // Returns whether face 3 is cut or not.
                              &f4,        // Returns whether face 4 is cut or not.
                              tolerance,  // Used to determine whether cut or not.
                              &inPts,     // Local number of intersection points.
                              &iPts,      // Local intersection points.
                              &inTris,    // Local number of intersection triangles.
                              &iTris,     // Local intersection triangles.
                              0,          // Current index into the array containing
                                          // all intersection points.
                              pit,        // Previous elements intersection indicators.
                              ppt,        // Previous elements int. point values.
                              pgIndex,    // Global index of previous intersections.
                              it,         // Current elements intersection indicators
                                          // (to be determined).
                              pt,         // Current elements int. point values
                                          // (to be determined).
                              gIndex,     // Current global index in global pts. array.
                              False,      // Flag to remove duplicate int.
                              eN,         // Element neighbor array.
                              -1,         // Previous element index.
                              -1,         // Face id from prev. element.
                              eit,        // Int. indicators for all cut elements.
                              ept,        // Int. point values for all cut elements.
                              egIndex );  // Global pt. indices for all cut elements.

// If recording the current element information would exceed allocated memory,
// then reallocate more memory.
if (*nAllocatedElement == nElements)
{
    // reallocate memory
}

(*Elem)[nElements++] = e;  // Store the element that has just been intersected.
(*nElem)[gridIndex]++;     // Increment the number of elements for the grid that
                           // it belongs to.

// If recording the current triangle information would exceed allocated memory,
// then reallocate more memory.
if (*nAllocatedSurface == inTris)
{
    // reallocate memory
}

pIndex = *nPts;              // Store the current index as of now.
for (i = 0; i < inPts; i++)  // For all intersection points just found,
{
    (*Pts)[*nPts][0] = iPts[i][0];  // Store the x value of the intersection point.
    (*Pts)[*nPts][1] = iPts[i][1];  // Store the y value of the intersection point.
    (*Pts)[*nPts][2] = iPts[i][2];  // Store the z value of the intersection point.
    (*EFPts)[*nPts] = e;            // Store the element that has just been cut.
    (*nPts)++;                      // Increment number of points in the cutting plane.
}


for (i = 0; i < inTris; i++)  // For all triangles formed from the cut,
{
    (*Tris)[*nTris][0] = iTris[i][0];  // Store first point index.
    (*Tris)[*nTris][1] = iTris[i][1];  // Store second point index.
    (*Tris)[*nTris][2] = iTris[i][2];  // Store third point index.
    (*nTris)++;                        // Increment the number of triangles in the
                                       // cutting plane.
}


// Find the cut elements index in the global element array.
eIndex = e*6;

// For all possible intersection points,
for (i = 0; i < 6; i++)
{
    // If this intersection indicator says edge has been cut,
    if (it[i])
    {
        ept[eIndex][0] = pt[i][0];  // Store the x point value.
        ept[eIndex][1] = pt[i][1];  // Store the y point value.
        ept[eIndex][2] = pt[i][2];  // Store the z point value.
        eit[eIndex] = it[i];        // Store the intersection indicator.

        // Store the global index for the point into the Pts array.
        egIndex[eIndex] = gIndex[i];

        ppt[i][0] = pt[i][0];  // Make this x value the previous x value.
        ppt[i][1] = pt[i][1];  // Make this y value the previous y value.
        ppt[i][2] = pt[i][2];  // Make this z value the previous z value.
        pit[i] = it[i];        // Make this intersection indicator the previous
                               // intersection indicator.

        // Make this global index the previous global index.
        pgIndex[i] = gIndex[i] = pIndex;

        pIndex++;  // Increment the current index into Pts.
    }

    eIndex++;  // Increment index into global information array.
}

eVisited[e]  =  True;  //  Set  this  element  as  having  been  visited. 

recursivelyCalculateCut( point,              // The point in the cutting plane.
                         normal,             // The normal to the cutting plane.
                         tolerance,          // User defined tolerance passed in originally.
                         nAllocatedSurface,  // The size of the Tris array right now.
                         nPts,               // Contains the number of points currently in cut.
                         Pts,                // Contains the points currently in the cut.
                         nTris,              // Contains the number of triangles currently in cut.
                         Tris,               // Contains the triangles currently in the cut.
                         EFPts,              // Contains element index for each point in the cut.
                         nAllocatedElement,  // The size of the Elem array right now.
                         nElem,              // Number of elements that have been currently cut.
                         Elem,               // The elements that have been currently cut.
                         e,                  // The element that the algorithm has just cut.
                         f1,                 // Flag indicating whether face 1 of e has been cut.
                         f2,                 // Flag indicating whether face 2 of e has been cut.
                         f3,                 // Flag indicating whether face 3 of e has been cut.
                         f4,                 // Flag indicating whether face 4 of e has been cut.
                         eVisited,           // Contains whether elements have been visited.
                         inPts,              // Contains number of local intersection points.
                         iPts,               // Contains local intersection points.
                         inTris,             // Contains number of local intersection triangles.
                         iTris,              // Contains local intersection triangles.
                         gridIndex,          // Contains the number of grids being cut.
                         &nElements,         // The total number of elements in all grids.
                         ppt,                // Previous intersection points.
                         pit,                // Previous intersection indicators.
                         pgIndex,            // Prev. global indices in Pts for int. pts.
                         pt,                 // Current intersection points.
                         it,                 // Current intersection indicators.
                         gIndex,             // Current global indices in Pts for int. pts.
                         ept,                // Contains all int. pts. for cut elements.
                         eit,                // Contains all int. indicators for cut elements.
                         egIndex );          // Contains all global indices into Pts for int. pts.
}  //  End  if  tetrahedra  intersects  the  cutting  plane. 

} // End calculateCuttingPlane.

void recursivelyCalculateCut( DOUBLE_3D point,             // The point on the plane.
                              DOUBLE_3D normal,            // The normal to the cutting plane.
                              double    tolerance,         // User defined tolerance passed in originally.
                              int       *nAllocatedSurface, // The size of the Tris array right now.
                              int       *nPts,             // Contains number of points currently in the cut.
                              DOUBLE_3D **Pts,             // Contains the points currently recorded.
                              int       *nTris,            // The number of triangles currently in the cut.
                              INT_3D    **Tris,            // Contains the triangles currently in the cut.
                              INT_1D    **EFPts,           // Contains element indices for each cut pt.
                              INT_1D    *nAllocatedElement, // The size of the Elem array now.
                              INT_1D    **nElem,           // Number of elements in the current cut.
                              INT_1D    **Elem,            // Elements in the current cut.
                              int       e,                 // Current element.
                              Boolean   f1,                // Flag indicating whether face 1 of e has been cut.
                              Boolean   f2,                // Flag indicating whether face 2 of e has been cut.
                              Boolean   f3,                // Flag indicating whether face 3 of e has been cut.
                              Boolean   f4,                // Flag indicating whether face 4 of e has been cut.
                              Boolean   *visited,          // Allocated in auxiliary for element visited.
                              int       inPts,             // Contains number of local intersection points.
                              DOUBLE_3D *iPts,             // Contains local intersection points.
                              int       inTris,            // Contains number of local intersection triangles.
                              INT_3D    *iTris,            // Contains local intersection triangles.
                              int       gridIndex,         // Number of grids being cut.
                              int       *nElements,        // The total number of elements in all grids.
                              DOUBLE_3D ppt[6],            // Previous elements intersection points.
                              INT_1D    pit[6],            // Previous elements intersection indicators.
                              INT_1D    pgIndex[6],        // Previous elements global point indices.
                              DOUBLE_3D pt[6],             // Current intersection points.
                              INT_1D    it[6],             // Current intersection indicators.
                              INT_1D    gIndex[6],         // Current global point indices.
                              DOUBLE_3D ept[6],            // Global intersection point information.
                              INT_1D    eit[6],            // Global intersection indicator information.
                              INT_1D    egIndex[6] )       // Global index information.
{

Boolean   ef1 = False;   // Local variable for whether face 1 cut.
Boolean   ef2 = False;   // Local variable for whether face 2 cut.
Boolean   ef3 = False;   // Local variable for whether face 3 cut.
Boolean   ef4 = False;   // Local variable for whether face 4 cut.
int       i = 0;         // Local counting variable.
int       pIndex = 0;    // Local point index variable.
int       eIndex = 0;    // Local element index variable.
INT_1D    spit[6];       // Local previous intersection indicators.
INT_1D    spgIndex[6];   // Local previous global index information.
INT_1D    sit[6];        // Local intersection indicators.
INT_1D    sgIndex[6];    // Local global index information.
double    t0 = 0.0;      // Local intersection parameter for point 0.
double    t1 = 0.0;      // Local intersection parameter for point 1.
double    t2 = 0.0;      // Local intersection parameter for point 2.
double    t3 = 0.0;      // Local intersection parameter for point 3.
double    tol = 0.0;     // Local weighted tolerance.
DOUBLE_3D p0;            // Local point 0 variable.
DOUBLE_3D p1;            // Local point 1 variable.
DOUBLE_3D p2;            // Local point 2 variable.
DOUBLE_3D p3;            // Local point 3 variable.
DOUBLE_3D sppt[6];       // Local previous point intersection information.
DOUBLE_3D spt[6];        // Local current point intersection information.

if (f1 and eN[e][0] >= 0 and !(visited[eN[e][0]]))
{
for (i = 0; i < 6; i++)  // For all potential intersections,
{
    sppt[i][0] = ppt[i][0];    // Record local copy of previous point.
    sppt[i][1] = ppt[i][1];    // Record local copy of previous point.
    sppt[i][2] = ppt[i][2];    // Record local copy of previous point.
    spit[i] = pit[i];          // Record local copy of previous indicator.
    spgIndex[i] = pgIndex[i];  // Record local copy of previous global indices.
    spt[i][0] = pt[i][0];      // Record local copy of current point.
    spt[i][1] = pt[i][1];      // Record local copy of current point.
    spt[i][2] = pt[i][2];      // Record local copy of current point.
    sit[i] = it[i];            // Record local copy of indicator.
    sgIndex[i] = gIndex[i];    // Record local copy of global indices.
}


visited[eN[e][0]] = True;  // Indicate this neighboring element has been visited.

// Set the values for point 0 of the neighbor across face 1 of element e.
p0[0] = coordinates[tetrahedra[eN[e][0]][0]][0];
p0[1] = coordinates[tetrahedra[eN[e][0]][0]][1];
p0[2] = coordinates[tetrahedra[eN[e][0]][0]][2];

// Set the values for point 1 of the neighbor across face 1 of element e.
p1[0] = coordinates[tetrahedra[eN[e][0]][1]][0];
p1[1] = coordinates[tetrahedra[eN[e][0]][1]][1];
p1[2] = coordinates[tetrahedra[eN[e][0]][1]][2];

// Set the values for point 2 of the neighbor across face 1 of element e.
p2[0] = coordinates[tetrahedra[eN[e][0]][2]][0];
p2[1] = coordinates[tetrahedra[eN[e][0]][2]][1];
p2[2] = coordinates[tetrahedra[eN[e][0]][2]][2];

// Set the values for point 3 of the neighbor across face 1 of element e.
p3[0] = coordinates[tetrahedra[eN[e][0]][3]][0];
p3[1] = coordinates[tetrahedra[eN[e][0]][3]][1];
p3[2] = coordinates[tetrahedra[eN[e][0]][3]][2];

// Compute the dot product of the vector from the plane point to point 0
// and the normal to the plane.
t0 = (p0[0] - point[0]) * normal[0] +
     (p0[1] - point[1]) * normal[1] +
     (p0[2] - point[2]) * normal[2];

// Compute the dot product of the vector from the plane point to point 1
// and the normal to the plane.
t1 = (p1[0] - point[0]) * normal[0] +
     (p1[1] - point[1]) * normal[1] +
     (p1[2] - point[2]) * normal[2];

// Compute the dot product of the vector from the plane point to point 2
// and the normal to the plane.
t2 = (p2[0] - point[0]) * normal[0] +
     (p2[1] - point[1]) * normal[1] +
     (p2[2] - point[2]) * normal[2];

// Compute the dot product of the vector from the plane point to point 3
// and the normal to the plane.
t3 = (p3[0] - point[0]) * normal[0] +
     (p3[1] - point[1]) * normal[1] +
     (p3[2] - point[2]) * normal[2];

//  Base  tolerance  on  the  max  edge  length  and  machine  precision, 
tol  =  (getMaxEdgeLength(eN[e][0]))*tolerance; 

// If the cutting plane intersects this tetrahedron,
if ( (t0 >= -tol and t1 <= tol) or (t0 <= tol and t1 >= -tol) or
     (t0 >= -tol and t2 <= tol) or (t0 <= tol and t2 >= -tol) or
     (t0 >= -tol and t3 <= tol) or (t0 <= tol and t3 >= -tol) or
     (t1 >= -tol and t2 <= tol) or (t1 <= tol and t2 >= -tol) or
     (t1 >= -tol and t3 <= tol) or (t1 <= tol and t3 >= -tol) or
     (t2 >= -tol and t3 <= tol) or (t2 <= tol and t3 >= -tol) )

{ 

intersectPlaneWithTetrahedra( eN[e][0],
                              point,
                              normal,
                              p0,
                              p1,
                              p2,
                              p3,
                              t0,
                              t1,
                              t2,
                              t3,
                              &ef1,
                              &ef2,
                              &ef3,
                              &ef4,
                              tolerance,
                              &inPts,
                              &iPts,
                              &inTris,
                              &iTris,
                              *nPts,
                              spit,
                              sppt,
                              spgIndex,
                              sit,
                              spt,
                              sgIndex,
                              True,
                              eN,
                              e,
                              0,
                              eit,
                              ept,
                              egIndex );

// If the memory has been exceeded, then reallocate the memory.
if (*nAllocatedElement == *nElements)
{
    // reallocate memory
}

// Record the element information and increment the number of elements.
(*Elem)[(*nElements)++] = eN[e][0];
(*nElem)[gridIndex]++;

// If the surface memory has been exceeded, then reallocate the memory.
if (*nAllocatedSurface == inTris)
{
    // reallocate memory
}

int index = 0;         // Index variable.
int begIndex = *nPts;  // Save starting index for this set of intersection points.

//  Cycle  over  all  possible  intersections, 
for  (i  =  0;  i  <  6;  i++) 

{ 

// If an intersection has occurred and the point is not a duplicate,
if ( sit[i] == 1 and sgIndex[i] >= begIndex )
{
    (*Pts)[*nPts][0] = iPts[index][0];  // Record the x value in the Pts array.
    (*Pts)[*nPts][1] = iPts[index][1];  // Record the y value in the Pts array.
    (*Pts)[*nPts][2] = iPts[index][2];  // Record the z value in the Pts array.
    (*EFPts)[*nPts] = eN[e][0];         // Record the element for this point.

    (*nPts)++;  // Increment the number of actual points in the cutting plane.
    index++;    // Increment the index.
}

} 

//  Cycle  over  all  triangles  resulting  from  the  intersection, 
for  (i  =  0;  i  <  inTris;  i++) 

{ 

(*Tris)[*nTris][0] = iTris[i][0];  // Record index one of the triangle.
(*Tris)[*nTris][1] = iTris[i][1];  // Record index two of the triangle.
(*Tris)[*nTris][2] = iTris[i][2];  // Record index three of the triangle.
(*nTris)++;                        // Increment the number of triangles.

} 

eIndex = (eN[e][0])*6;  // Dereference element index.

// Cycle over all possible intersections,
for (i = 0; i < 6; i++)
{
    // If the edge is intersected,
    if (sit[i])
    {
        ept[eIndex][0] = spt[i][0];    // Record the x value.
        ept[eIndex][1] = spt[i][1];    // Record the y value.
        ept[eIndex][2] = spt[i][2];    // Record the z value.
        eit[eIndex] = sit[i];          // Record the intersection indicator.
        egIndex[eIndex] = sgIndex[i];  // Record the global index.
    }

    eIndex++;  // Increment the index.
}

recursivelyCalculateCut( point,              // The point on the cutting plane.
                         normal,             // The normal to the cutting plane.
                         tolerance,          // User defined tolerance.
                         nAllocatedSurface,  // Amt of surface info memory.
                         nPts,               // Number of pts. in the current cutting plane.
                         Pts,                // The points currently in the cutting plane.
                         nTris,              // Number of triangles in the current cutting plane.
                         Tris,               // The triangles currently in the cutting plane.
                         EFPts,              // The elements for each point in the cutting plane.
                         nAllocatedElement,  // Amt of element info memory.
                         nElem,              // The number of elements in the cutting plane.
                         Elem,               // Element indices intersected by the cutting plane.
                         eN[e][0],           // The current element.
                         ef1,                // Whether face 1 of current element is intersected.
                         ef2,                // Whether face 2 of current element is intersected.
                         ef3,                // Whether face 3 of current element is intersected.
                         ef4,                // Whether face 4 of current element is intersected.
                         visited,            // Contains visit information for each element.
                         inPts,              // Local number of intersection points.
                         iPts,               // Local intersection points.
                         inTris,             // Local number of intersected triangles.
                         iTris,              // Local intersected triangles.
                         gridIndex,          // Number of grids being cut.
                         nElements,          // Total number of elements in all grids.
                         spt,                // Previous point intersections.
                         sit,                // Previous intersection indicators.
                         sgIndex,            // Previous global indices.
                         spt,                // Current point intersections.
                         sit,                // Current intersection indicators.
                         sgIndex,            // Current global indices.
                         ept,                // Global point intersection information.
                         eit,                // Global intersection indicator information.
                         egIndex );          // Global point index information.

}  //  End  if  cutting  plane  intersects  this  tetrahedra. 

} // End if (f1 and eN[e][0] >= 0 and !(visited[eN[e][0]])).

if (f2 and eN[e][1] >= 0 and !(visited[eN[e][1]]))
{
    // Method analogous to that for face 1.
}

if (f3 and eN[e][2] >= 0 and !(visited[eN[e][2]]))
{
    // Method analogous to that for face 1.
}

if (f4 and eN[e][3] >= 0 and !(visited[eN[e][3]]))
{
    // Method analogous to that for face 1.
}

}  //  End  recursivelyCalculateCut. 


